AI Analyzer - Core Utilities Package

Shared utilities and core functionality used by all LinkedOut parsers. This package provides consistent logging, text processing, location validation, AI integration, and a command-line interface for AI analysis.

🎯 Purpose

The AI Analyzer serves as the foundation for all LinkedOut components, providing:

  • Consistent Logging: Unified logging system across all parsers
  • Text Processing: Keyword matching, content cleaning, and analysis
  • Location Validation: Geographic filtering and location intelligence
  • AI Integration: Local Ollama support with integrated analysis
  • CLI Tool: Command-line interface for standalone AI analysis
  • Test Utilities: Shared testing helpers and mocks

📦 Components

1. Logger (src/logger.js)

Configurable logging system with color support and level controls.

const { logger } = require("ai-analyzer");

// Basic logging
logger.info("Processing started");
logger.warning("Rate limit approaching");
logger.error("Connection failed");

// Convenience methods with emoji prefixes
logger.step("🚀 Starting scrape");
logger.search("🔍 Searching for keywords");
logger.ai("🧠 Running AI analysis");
logger.location("📍 Validating location");
logger.file("📄 Saving results");

Features:

  • Configurable log levels (debug, info, warning, error, success)
  • Color-coded output with chalk
  • Emoji prefixes for better UX
  • Silent mode for production
  • Timestamp formatting
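
The level-gating idea can be sketched in a few lines. This is an illustrative sketch only, assuming an injectable `write` function for testability; the real `Logger` also handles colors, emoji prefixes, and timestamps, and `createLevelLogger` is not part of the package API:

```javascript
// Sketch of level-gated logging. `createLevelLogger` and its `write`
// parameter are illustrative names, not part of the ai-analyzer API.
function createLevelLogger(levels, write = console.log) {
  const emit = (level) => (message) => {
    // Drop the message entirely when its level is disabled
    if (levels[level]) write(`[${level}] ${message}`);
  };
  return {
    debug: emit("debug"),
    info: emit("info"),
    warning: emit("warning"),
    error: emit("error"),
  };
}
```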

2. Text Utilities (src/text-utils.js)

Text processing and keyword matching utilities.

const { cleanText, containsAnyKeyword } = require("ai-analyzer");

// Clean text content
const cleaned = cleanText(
  "Check out this #awesome post! https://example.com 🚀"
);
// Result: "Check out this awesome post!"

// Check for keyword matches
const keywords = ["layoff", "downsizing", "RIF"];
const hasMatch = containsAnyKeyword(text, keywords);

Features:

  • Remove hashtags, URLs, and emojis
  • Case-insensitive keyword matching
  • Multiple keyword detection
  • Text normalization
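
A minimal sketch of how these steps could compose (illustrative only; the regex choices and function names below are assumptions, not the package's implementation):

```javascript
// Illustrative sketch of the cleaning pipeline, not the package's code.
function cleanTextSketch(text) {
  return text
    .replace(/https?:\/\/\S+/g, "")          // strip URLs
    .replace(/#(\w+)/g, "$1")                // drop the # but keep the tag word
    .replace(/[\u{1F300}-\u{1FAFF}]/gu, "")  // strip common emoji ranges
    .replace(/\s+/g, " ")                    // collapse whitespace
    .trim();
}

// Case-insensitive check for any of the given keywords.
function containsAnyKeywordSketch(text, keywords) {
  const lower = text.toLowerCase();
  return keywords.some((k) => lower.includes(k.toLowerCase()));
}
```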

3. Location Utilities (src/location-utils.js)

Geographic location validation and filtering.

const {
  parseLocationFilters,
  validateLocationAgainstFilters,
  extractLocationFromProfile,
} = require("ai-analyzer");

// Parse location filter string
const filters = parseLocationFilters("Ontario,Manitoba,Toronto");

// Validate location against filters
const isValid = validateLocationAgainstFilters(
  "Toronto, Ontario, Canada",
  filters
);

// Extract location from profile text
const location = extractLocationFromProfile(
  "Software Engineer at Tech Corp • Toronto, Ontario"
);

Features:

  • Geographic filter parsing
  • Location validation against 200+ Canadian cities
  • Profile location extraction
  • Smart location matching
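
The filter-matching idea can be sketched as a case-insensitive substring match (an assumption for illustration; the real package also consults its Canadian-city list, which is omitted here):

```javascript
// Illustrative sketches; function names are not the package's exports.
function parseLocationFiltersSketch(filterString) {
  return filterString
    .split(",")
    .map((f) => f.trim().toLowerCase())
    .filter(Boolean); // drop empty segments from trailing commas
}

function validateLocationSketch(location, filters) {
  const loc = location.toLowerCase();
  return filters.some((f) => loc.includes(f));
}
```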

4. AI Utilities (src/ai-utils.js)

AI-powered content analysis with integrated results.

const { analyzeBatch, checkOllamaStatus } = require("ai-analyzer");

// Check AI availability
const aiAvailable = await checkOllamaStatus("mistral");

// Analyze posts with AI (returns analysis results)
const analysis = await analyzeBatch(posts, "job market analysis", "mistral");

// Integrate AI analysis into results
const resultsWithAI = posts.map((post, index) => ({
  ...post,
  aiAnalysis: {
    isRelevant: analysis[index].isRelevant,
    confidence: analysis[index].confidence,
    reasoning: analysis[index].reasoning,
    context: "job market analysis",
    model: "mistral",
    analyzedAt: new Date().toISOString(),
  },
}));

Features:

  • Ollama integration for local AI
  • Batch processing for efficiency
  • Confidence scoring
  • Context-aware analysis
  • Integrated results: AI analysis embedded in data structure
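
The batch-processing idea amounts to chunking posts before each call to `analyzeBatch`, so no single request to the model grows too large. A minimal sketch (the helper name and default batch size are illustrative assumptions, not package API):

```javascript
// Split posts into fixed-size chunks for batched analysis.
// `chunkPosts` is an illustrative helper, not an ai-analyzer export.
function chunkPosts(posts, batchSize = 5) {
  const batches = [];
  for (let i = 0; i < posts.length; i += batchSize) {
    batches.push(posts.slice(i, i + batchSize));
  }
  return batches;
}
```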

5. CLI Tool (cli.js)

Command-line interface for standalone AI analysis.

# Analyze latest results file
node cli.js --latest --dir=results

# Analyze specific file
node cli.js --input=results.json

# Analyze with custom context
node cli.js --input=results.json --context="layoff analysis"

# Analyze with different model
node cli.js --input=results.json --model=mistral

# Show help
node cli.js --help

Features:

  • Integrated Analysis: AI results embedded back into original JSON
  • Flexible Input: Support for various JSON formats
  • Context Switching: Easy re-analysis with different contexts
  • Model Selection: Choose different Ollama models
  • Directory Support: Specify results directory with --dir

6. Test Utilities (src/test-utils.js)

Shared testing helpers and mocks.

const { createMockPost, createMockProfile } = require("ai-analyzer");

// Create test data
const mockPost = createMockPost({
  content: "Test post content",
  author: "John Doe",
  location: "Toronto, Ontario",
});
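
Mock factories like this typically merge defaults with per-test overrides. A sketch of that pattern (field names mirror the example above; the real helper's defaults may differ):

```javascript
// Illustrative defaults-plus-overrides factory, not the package's code.
function createMockPostSketch(overrides = {}) {
  return {
    content: "Default post content",
    author: "Test Author",
    location: "Toronto, Ontario",
    timestamp: new Date(0).toISOString(),
    ...overrides, // caller-supplied fields win over defaults
  };
}
```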

🚀 Installation

# Install dependencies
npm install

# Run tests
npm test

# Run specific test suites
npm test -- --testNamePattern="Logger"

📋 CLI Reference

Basic Usage

# Analyze latest results file
node cli.js --latest --dir=results

# Analyze specific file
node cli.js --input=results.json

# Analyze with custom output
node cli.js --input=results.json --output=analysis.json

Options

--input=FILE              # Input JSON file
--output=FILE             # Output file (default: original-ai.json)
--context="description"   # Analysis context (default: "job market analysis and trends")
--model=MODEL             # Ollama model (default: mistral)
--latest                  # Use latest results file from directory
--dir=PATH                # Directory to look for results (default: 'results')
--help, -h                # Show help
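
Flags in the `--key=value` style above can be parsed from `process.argv` with a small loop. This sketch mirrors the option names in the table but is not the CLI's actual parser:

```javascript
// Illustrative argv parsing for --key=value flags and boolean switches.
function parseArgs(argv) {
  const opts = {};
  for (const arg of argv) {
    if (arg === "--latest") opts.latest = true;
    else if (arg === "--help" || arg === "-h") opts.help = true;
    else if (arg.startsWith("--")) {
      // Split only on the first "=" so values may contain "=" themselves
      const [key, ...rest] = arg.slice(2).split("=");
      opts[key] = rest.join("=");
    }
  }
  return opts;
}
```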

Examples

# Analyze latest LinkedIn results
cd linkedin-parser
node ../ai-analyzer/cli.js --latest --dir=results

# Analyze with layoff context
node cli.js --input=results.json --context="layoff analysis"

# Analyze with different model
node cli.js --input=results.json --model=llama3

# Analyze from project root
node ai-analyzer/cli.js --latest --dir=linkedin-parser/results

Output Format

The CLI integrates AI analysis directly into the original JSON structure:

{
  "metadata": {
    "timestamp": "2025-07-21T02:00:08.561Z",
    "totalPosts": 10,
    "aiAnalysisUpdated": "2025-07-21T02:48:42.487Z",
    "aiContext": "job market analysis and trends",
    "aiModel": "mistral"
  },
  "results": [
    {
      "keyword": "layoff",
      "text": "Post content...",
      "aiAnalysis": {
        "isRelevant": true,
        "confidence": 0.9,
        "reasoning": "Post discusses job market conditions",
        "context": "job market analysis and trends",
        "model": "mistral",
        "analyzedAt": "2025-07-21T02:48:42.487Z"
      }
    }
  ]
}

📋 API Reference

Logger Class

const { Logger } = require("ai-analyzer");

// Create custom logger
const logger = new Logger({
  debug: false,
  colors: true,
});

// Configure levels
logger.setLevel("debug", true);
logger.silent(); // Disable all logging
logger.verbose(); // Enable all logging

Text Processing

const { cleanText, containsAnyKeyword } = require('ai-analyzer');

// Clean text
cleanText(text: string): string

// Check keywords
containsAnyKeyword(text: string, keywords: string[]): boolean

Location Validation

const {
  parseLocationFilters,
  validateLocationAgainstFilters,
  extractLocationFromProfile
} = require('ai-analyzer');

// Parse filters
parseLocationFilters(filterString: string): string[]

// Validate location
validateLocationAgainstFilters(location: string, filters: string[]): boolean

// Extract from profile
extractLocationFromProfile(profileText: string): string | null

AI Analysis

const { analyzeBatch, checkOllamaStatus, findLatestResultsFile } = require('ai-analyzer');

// Check AI availability
checkOllamaStatus(model?: string, ollamaHost?: string): Promise<boolean>

// Analyze posts
analyzeBatch(posts: Post[], context: string, model?: string): Promise<AnalysisResult[]>

// Find latest results file
findLatestResultsFile(resultsDir?: string): string
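
The "latest file" selection behind `findLatestResultsFile` can be sketched on pre-listed directory entries (illustrative; the real function reads the directory itself via `fs` and its exact ordering criterion is an assumption here):

```javascript
// Pick the most recently modified .json entry from a listing.
// `pickLatest` is an illustrative helper, not the package's export.
function pickLatest(entries) {
  // entries: [{ name, mtimeMs }]
  const jsonFiles = entries.filter((e) => e.name.endsWith(".json"));
  if (jsonFiles.length === 0) return null;
  return jsonFiles.reduce((a, b) => (b.mtimeMs > a.mtimeMs ? b : a)).name;
}
```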

🧪 Testing

Run All Tests

npm test

Test Coverage

npm run test:coverage

Specific Test Suites

# Logger tests
npm test -- --testNamePattern="Logger"

# Text utilities tests
npm test -- --testNamePattern="Text"

# Location utilities tests
npm test -- --testNamePattern="Location"

# AI utilities tests
npm test -- --testNamePattern="AI"

🔧 Configuration

Environment Variables

# AI Configuration
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=mistral
AI_CONTEXT="job market analysis and trends"

# Logging Configuration
LOG_LEVEL=info
LOG_COLORS=true

# Location Configuration
LOCATION_FILTER=Ontario,Manitoba
ENABLE_LOCATION_CHECK=true
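
A parser might read these variables with fallbacks matching the CLI defaults documented above. The exact precedence is an assumption; this is a sketch, not the package's configuration loader:

```javascript
// Illustrative env-var loading with documented defaults as fallbacks.
const config = {
  ollamaHost: process.env.OLLAMA_HOST || "http://localhost:11434",
  ollamaModel: process.env.OLLAMA_MODEL || "mistral",
  aiContext: process.env.AI_CONTEXT || "job market analysis and trends",
  // Comma-separated filter string parsed into an array
  locationFilter: (process.env.LOCATION_FILTER || "").split(",").filter(Boolean),
  // Location checking stays on unless explicitly disabled
  enableLocationCheck: process.env.ENABLE_LOCATION_CHECK !== "false",
};
```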

Logger Configuration

const logger = new Logger({
  debug: true, // Enable debug logging
  info: true, // Enable info logging
  warning: true, // Enable warning logging
  error: true, // Enable error logging
  success: true, // Enable success logging
  colors: true, // Enable color output
});

📊 Usage Examples

Basic Logging Setup

const { logger } = require("ai-analyzer");

// Configure for production
if (process.env.NODE_ENV === "production") {
  logger.setLevel("debug", false);
  logger.setLevel("info", true);
}

// Use throughout your application
logger.step("Starting LinkedIn scrape");
logger.info("Found 150 posts");
logger.warning("Rate limit approaching");
logger.success("Scraping completed successfully");

Text Processing Pipeline

const { cleanText, containsAnyKeyword } = require("ai-analyzer");

function processPost(post) {
  // Clean the content
  const cleanedContent = cleanText(post.content);

  // Check for keywords
  const keywords = ["layoff", "downsizing", "RIF"];
  const hasKeywords = containsAnyKeyword(cleanedContent, keywords);

  return {
    ...post,
    cleanedContent,
    hasKeywords,
  };
}

Location Validation

const {
  parseLocationFilters,
  validateLocationAgainstFilters,
} = require("ai-analyzer");

// Setup location filtering
const locationFilters = parseLocationFilters("Ontario,Manitoba,Toronto");

// Validate each post
function validatePost(post) {
  const isValidLocation = validateLocationAgainstFilters(
    post.author.location,
    locationFilters
  );

  return isValidLocation ? post : null;
}

AI Analysis Integration

const { analyzeBatch, checkOllamaStatus } = require("ai-analyzer");

async function analyzePosts(posts) {
  try {
    // Check AI availability
    const aiAvailable = await checkOllamaStatus("mistral");
    if (!aiAvailable) {
      logger.warning("AI not available - skipping analysis");
      return posts;
    }

    // Run AI analysis
    const analysis = await analyzeBatch(
      posts,
      "job market analysis",
      "mistral"
    );

    // Integrate AI analysis into results
    const resultsWithAI = posts.map((post, index) => ({
      ...post,
      aiAnalysis: {
        isRelevant: analysis[index].isRelevant,
        confidence: analysis[index].confidence,
        reasoning: analysis[index].reasoning,
        context: "job market analysis",
        model: "mistral",
        analyzedAt: new Date().toISOString(),
      },
    }));

    return resultsWithAI;
  } catch (error) {
    logger.error("AI analysis failed:", error.message);
    return posts; // Return original posts if AI fails
  }
}

CLI Integration

// In your parser's package.json scripts
{
  "scripts": {
    "analyze:latest": "node ../ai-analyzer/cli.js --latest --dir=results",
    "analyze:layoff": "node ../ai-analyzer/cli.js --latest --dir=results --context=\"layoff analysis\"",
    "analyze:trends": "node ../ai-analyzer/cli.js --latest --dir=results --context=\"job market trends\""
  }
}

🔒 Security & Best Practices

Credential Management

  • Store API keys in environment variables
  • Never commit sensitive data to version control
  • Use .env files for local development

Rate Limiting

  • Implement delays between AI API calls
  • Respect service provider rate limits
  • Use batch processing to minimize requests
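
These guidelines can be combined in a small wrapper: process batches sequentially and pause between calls. The names and the one-second default below are illustrative assumptions, not package API:

```javascript
// Promise-based delay helper.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run an analysis function over batches with a pause between requests.
async function analyzeWithDelay(batches, analyzeFn, delayMs = 1000) {
  const results = [];
  for (const batch of batches) {
    results.push(await analyzeFn(batch));
    await sleep(delayMs); // respect provider rate limits between calls
  }
  return results;
}
```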

Error Handling

  • Always wrap AI calls in try-catch blocks
  • Provide fallback behavior when services fail
  • Log errors with appropriate detail levels

🤝 Contributing

Development Setup

  1. Fork the repository
  2. Create feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit pull request

Code Standards

  • Follow existing code style
  • Add JSDoc comments for all functions
  • Maintain test coverage above 90%
  • Update documentation for new features

📄 License

This package is part of the LinkedOut platform and follows the same licensing terms.


Note: This package is designed to be used as a dependency by other LinkedOut components. It provides the core utilities and the CLI tool; apart from CLI-based analysis, it is not intended for standalone use.