Job Market Intelligence Platform

A modular platform for gathering job market intelligence, with AI analysis integrated into the results. Each parser is a separate package that shares a common set of core utilities, which keeps the codebase extensible and maintainable.

🏗️ Architecture Overview

job-market-intelligence/
├── ai-analyzer/              # Shared core utilities (logger, AI, location, text) + CLI tool
├── linkedin-parser/          # LinkedIn-specific scraper with integrated AI analysis
├── job-search-parser/        # Job search intelligence
└── docs/                     # Documentation

🚀 Quick Start

Prerequisites

  • Node.js 18+
  • Playwright browser automation
  • LinkedIn account credentials
  • Optional: Ollama for local AI analysis

Installation

npm install
npx playwright install chromium

Basic Usage

# Run LinkedIn parser with integrated AI analysis
cd linkedin-parser && npm start

# Run LinkedIn parser with specific keywords
cd linkedin-parser && npm run start:custom

# Run LinkedIn parser without AI analysis
cd linkedin-parser && npm run start:no-ai

# Run job search parser
cd job-search-parser && npm start

# Analyze existing results with AI (CLI)
cd linkedin-parser && npm run analyze:latest

# Analyze with custom context
cd linkedin-parser && npm run analyze:layoff

# Run demo workflow
node demo.js

📦 Core Components

1. AI Analyzer (ai-analyzer/)

Shared utilities and CLI tool used by all parsers

  • Logger: Consistent logging across all components
  • Text Processing: Keyword matching, text cleaning
  • Location Validation: Geographic filtering and validation
  • AI Integration: Local Ollama support with integrated analysis
  • CLI Tool: Command-line interface for standalone AI analysis
  • Test Utilities: Shared testing helpers

Key Features:

  • Configurable log levels with color support
  • Intelligent text processing and keyword matching
  • Geographic location validation against filters
  • Integrated AI analysis: AI results embedded in data structure
  • CLI tool: Standalone analysis with flexible options
  • Comprehensive test coverage

2. LinkedIn Parser (linkedin-parser/)

Specialized LinkedIn content scraper with integrated AI analysis

  • Automated LinkedIn login and navigation
  • Keyword-based post searching
  • Profile location validation
  • Duplicate detection and filtering
  • Automatic AI analysis integrated into results
  • Configurable search parameters

Key Features:

  • Browser automation with Playwright
  • Geographic filtering by city/region
  • Date range filtering (24h, week, month)
  • Integrated AI-powered content relevance analysis
  • Results JSON with embedded AI insights
  • Separate JSON file listing rejected posts and the reason for each rejection

3. Job Search Parser (job-search-parser/)

Job market intelligence and analysis

  • Job posting aggregation
  • Role-specific keyword tracking
  • Market trend analysis
  • Salary and requirement insights

Key Features:

  • Tech role keyword tracking
  • Industry-specific analysis
  • Market demand insights
  • Competitive intelligence

4. AI Analysis CLI (ai-analyzer/cli.js)

Command-line tool for AI analysis of any results JSON file

  • Analyze any results JSON file from LinkedIn parser or other sources
  • Integrated analysis: AI results embedded back into original JSON
  • Custom analysis context and AI models
  • Comprehensive analysis summary and statistics
  • Flexible input format support

Key Features:

  • Works with any JSON results file
  • Integrated output: AI analysis embedded in original structure
  • Custom analysis contexts
  • Detailed relevance scoring
  • Confidence level analysis
  • Summary statistics and insights

🔧 Configuration

Environment Variables

Create a .env file in the root directory:

# LinkedIn Credentials
LINKEDIN_USERNAME=your_email@example.com
LINKEDIN_PASSWORD=your_password

# Search Configuration
CITY=Toronto
DATE_POSTED=past-week
SORT_BY=date_posted
WHEELS=5

# Location Filtering
LOCATION_FILTER=Ontario,Manitoba
ENABLE_LOCATION_CHECK=true

# AI Analysis
ENABLE_AI_ANALYSIS=true
AI_CONTEXT="job market analysis and trends"
OLLAMA_MODEL=mistral

# Keywords
KEYWORDS=keywords-layoff.csv
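
As an illustration of how these variables might be consumed, here is a minimal sketch that turns an environment object into a typed configuration. The function name `loadConfig`, the returned property names, and every default value are assumptions made for this example; they are not taken from the parser's actual code.

```javascript
// Hypothetical helper: builds a config object from the variables listed above.
// All defaults here are illustrative assumptions, not the project's real defaults.
function loadConfig(env = process.env) {
  // Treat "true" (any case, surrounding spaces ignored) as true; use the fallback when unset.
  const parseBool = (value, fallback) =>
    value === undefined ? fallback : String(value).trim().toLowerCase() === "true";

  return {
    city: env.CITY || "",
    datePosted: env.DATE_POSTED || "past-week",
    sortBy: env.SORT_BY || "date_posted",
    // "Ontario,Manitoba" -> ["Ontario", "Manitoba"]
    locationFilter: (env.LOCATION_FILTER || "")
      .split(",")
      .map((entry) => entry.trim())
      .filter(Boolean),
    enableLocationCheck: parseBool(env.ENABLE_LOCATION_CHECK, true),
    enableAiAnalysis: parseBool(env.ENABLE_AI_ANALYSIS, true),
    aiContext: env.AI_CONTEXT || "",
    ollamaModel: env.OLLAMA_MODEL || "mistral",
    keywordsFile: env.KEYWORDS || "",
  };
}
```

Passing the environment in as a parameter, rather than reading `process.env` directly inside the function, makes the helper easy to exercise in tests with a plain object.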

Command Line Options

# LinkedIn Parser Options
--headless=true|false         # Browser headless mode
--keyword="kw1,kw2"          # Specific keywords
--add-keyword="kw1,kw2"      # Additional keywords
--no-location                # Disable location filtering
--no-ai                      # Disable AI analysis

# Job Search Parser Options
--help                       # Show parser-specific help

# AI Analysis CLI Options
--input=FILE                 # Input JSON file
--output=FILE                # Output file
--context="description"      # Custom AI analysis context
--model=MODEL                # Ollama model
--latest                     # Use latest results file
--dir=PATH                   # Directory to look for results

📊 Output Formats

LinkedIn Parser Output

The LinkedIn parser generates two files: the main results, with AI analysis embedded in each post, and the posts rejected during filtering:

1. Main Results with AI Analysis (linkedin-results-YYYY-MM-DD-HH-MM.json)

{
  "metadata": {
    "timestamp": "2024-01-15T10:30:00Z",
    "totalPosts": 45,
    "rejectedPosts": 12,
    "aiAnalysisEnabled": true,
    "aiAnalysisCompleted": true,
    "aiContext": "job market analysis and trends",
    "aiModel": "mistral",
    "locationFilter": "Ontario,Manitoba"
  },
  "results": [
    {
      "keyword": "layoff",
      "text": "Cleaned post content...",
      "profileLink": "https://linkedin.com/in/johndoe",
      "location": "Toronto, Ontario, Canada",
      "locationValid": true,
      "locationMatchedFilter": "Ontario",
      "locationReasoning": "Location matches filter",
      "timestamp": "2024-01-15T10:30:00Z",
      "source": "linkedin",
      "parser": "linkedout-parser",
      "aiAnalysis": {
        "isRelevant": true,
        "confidence": 0.9,
        "reasoning": "Post discusses job market conditions and layoffs",
        "context": "job market analysis and trends",
        "model": "mistral",
        "analyzedAt": "2024-01-15T10:30:00Z"
      }
    }
  ]
}
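
Because the AI analysis is embedded in each result, downstream filtering needs only the one file. The following sketch keeps posts that were marked relevant with at least a given confidence. The helper name `selectRelevant` and the 0.7 threshold are assumptions for illustration; the field names come from the sample above.

```javascript
// Hypothetical helper: keep posts the AI marked relevant with enough confidence.
// `data` is the parsed contents of a linkedin-results-*.json file.
function selectRelevant(data, minConfidence = 0.7) {
  return (data.results || []).filter((post) => {
    const ai = post.aiAnalysis;
    // Posts without an aiAnalysis block (e.g. runs with --no-ai) are excluded.
    return Boolean(ai) && ai.isRelevant === true && ai.confidence >= minConfidence;
  });
}
```

Applied to the sample document above, `selectRelevant(data)` would return the single post with `confidence: 0.9`.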

2. Rejected Posts (linkedin-rejected-YYYY-MM-DD-HH-MM.json)

[
  {
    "rejected": true,
    "reason": "Location filter failed: Location not in filter",
    "keyword": "layoff",
    "text": "Post content...",
    "profileLink": "https://linkedin.com/in/janedoe",
    "location": "Vancouver, BC, Canada",
    "timestamp": "2024-01-15T10:30:00Z"
  }
]
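
A quick tally of the `reason` field can show whether a filter is too strict. This is a small sketch under the assumption that the rejected file is a flat array shaped like the sample above; `countRejectionReasons` is a hypothetical name.

```javascript
// Hypothetical helper: count how many posts were rejected for each reason.
// `rejected` is the parsed contents of a linkedin-rejected-*.json file.
function countRejectionReasons(rejected) {
  const counts = {};
  for (const post of rejected) {
    const reason = post.reason || "unknown";
    counts[reason] = (counts[reason] || 0) + 1;
  }
  return counts;
}
```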

AI Analysis CLI Output

The CLI tool creates integrated results with AI analysis embedded:

Re-analyzed Results (original-filename-ai.json)

{
  "metadata": {
    "timestamp": "2024-01-15T10:30:00Z",
    "totalPosts": 45,
    "aiAnalysisUpdated": "2024-01-15T11:00:00Z",
    "aiContext": "layoff analysis",
    "aiModel": "mistral"
  },
  "results": [
    {
      "keyword": "layoff",
      "text": "Post content...",
      "profileLink": "https://linkedin.com/in/johndoe",
      "location": "Toronto, Ontario, Canada",
      "aiAnalysis": {
        "isRelevant": true,
        "confidence": 0.9,
        "reasoning": "Post mentions layoffs and workforce reduction",
        "context": "layoff analysis",
        "model": "mistral",
        "analyzedAt": "2024-01-15T11:00:00Z"
      }
    }
  ]
}

🧪 Testing

Run All Tests

npm test

Run Specific Test Suites

# AI Analyzer tests
cd ai-analyzer && npm test

# LinkedIn Parser tests
cd linkedin-parser && npm test

# Job Search Parser tests
cd job-search-parser && npm test

Security Best Practices

  • Store credentials in a .env file and never commit it
  • Use environment variables for sensitive data
  • Implement rate limiting, and respect rate limits and usage policies
  • Respect LinkedIn's Terms of Service and monitor them for changes
  • Use for educational/research purposes only
  • Implement data retention policies

🚀 Advanced Features

AI-Powered Analysis

  • Local AI: Ollama integration for privacy
  • Integrated Analysis: AI results embedded in data structure
  • Automatic Analysis: Runs after parsing completes
  • Context Analysis: Relevance scoring
  • Confidence Scoring: AI confidence levels for each post
  • CLI Tool: Standalone analysis with flexible options

Geographic Intelligence

  • Location Validation: Profile location verification
  • Regional Filtering: City, state/province, and country filtering
  • Geographic Analysis: Location-based insights
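
To make the location validation concrete, here is a minimal sketch that checks a profile location against the comma-separated `LOCATION_FILTER` entries and returns the same three fields that appear in the results JSON. The case-insensitive substring match is an assumption chosen for simplicity; the shared utility in `ai-analyzer/` may apply different or stricter rules.

```javascript
// Hypothetical sketch of location validation using case-insensitive substring matching.
// Returns the locationValid / locationMatchedFilter / locationReasoning fields
// seen in the results JSON.
function validateLocation(location, filters) {
  const haystack = (location || "").toLowerCase();
  const matched = filters
    .map((entry) => entry.trim())
    .filter(Boolean) // ignore empty entries such as a trailing comma
    .find((entry) => haystack.includes(entry.toLowerCase()));

  return {
    locationValid: matched !== undefined,
    locationMatchedFilter: matched === undefined ? null : matched,
    locationReasoning: matched === undefined ? "Location not in filter" : "Location matches filter",
  };
}
```

Substring matching is permissive: a filter entry such as `York` would also match `New York`, so short entries should be chosen with care.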

Data Processing

  • Duplicate Detection: Intelligent deduplication
  • Content Cleaning: Remove hashtags, URLs, emojis
  • Metadata Extraction: Author, engagement, timing data
  • Integrated AI: AI insights embedded in each result
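
The cleaning and deduplication steps can be sketched as follows. Both functions are hypothetical illustrations: the regular expressions and the choice of `profileLink` plus cleaned text as the duplicate key are assumptions, and the project's own implementation may differ.

```javascript
// Hypothetical sketch: strip URLs, hashtags, and emojis, then collapse whitespace.
function cleanText(text) {
  return text
    .replace(/https?:\/\/\S+/g, " ") // URLs
    .replace(/#[\p{L}\p{N}_]+/gu, " ") // hashtags
    .replace(/\p{Extended_Pictographic}/gu, " ") // emoji code points
    .replace(/[\u200D\uFE0F]/g, "") // leftover joiners and variation selectors
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical sketch: treat posts with the same author link and the same
// cleaned, lowercased text as duplicates, keeping the first occurrence.
function dedupePosts(posts) {
  const seen = new Set();
  return posts.filter((post) => {
    const key = `${post.profileLink}|${cleanText(post.text).toLowerCase()}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```

Keying on cleaned text means two copies of a post that differ only in hashtags or tracking links collapse into one entry.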

📈 Performance Optimization

  • Headless Mode: Faster execution
  • Location Filtering: Reduces false positives
  • AI Analysis: Improves result quality (enabled by default)
  • Batch Processing: Efficient data handling

Monitoring

  • Real-time progress indicators
  • Detailed logging with configurable levels
  • Performance metrics tracking
  • Error handling and recovery

🤝 Contributing

Development Setup

  1. Fork the repository
  2. Create feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit pull request

Code Standards

  • Follow existing code style
  • Add JSDoc comments
  • Maintain test coverage
  • Update documentation

📄 License

This project is for educational and research purposes. Please respect LinkedIn's Terms of Service and use responsibly.

🆘 Support

Common Issues

  • Browser Issues: Ensure Playwright is installed
  • Login Problems: Check credentials in .env
  • Rate Limiting: Implement delays between requests
  • Location Filtering: Verify location filter format
  • AI Analysis: Ensure Ollama is running for AI features
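
For the last item, a small pre-flight check can confirm that Ollama is reachable and that the configured model has been pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models, on the default port 11434. The helper name `isOllamaReady` and the injectable `fetchImpl` parameter are assumptions for this example; it relies on the global `fetch` available in Node.js 18+.

```javascript
// Hypothetical pre-flight check: is Ollama reachable, and is the model available locally?
async function isOllamaReady(model, baseUrl = "http://localhost:11434", fetchImpl = fetch) {
  try {
    const response = await fetchImpl(`${baseUrl}/api/tags`);
    if (!response.ok) return false;
    const { models = [] } = await response.json();
    // Ollama reports names with a tag, e.g. "mistral:latest".
    return models.some((m) => m.name === model || m.name.startsWith(`${model}:`));
  } catch {
    // Connection refused, invalid JSON, and similar failures all mean "not ready".
    return false;
  }
}
```

A caller could run `await isOllamaReady("mistral")` before starting a long scrape and fall back to `--no-ai` when it returns `false`.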

Getting Help

  • Check the component-specific READMEs
  • Review the demo files for examples
  • Examine the test files for usage patterns
  • Open an issue with detailed error information

🆕 What's New

  • Integrated AI Analysis: AI results are now embedded directly in the results JSON
  • No Separate Files: No more separate AI analysis files to manage
  • CLI Tool: Standalone AI analysis with flexible options
  • Rich Context: Each post includes detailed AI insights
  • Flexible Re-analysis: Easy to re-analyze with different contexts
  • Backward Compatible: Original data structure preserved

Note: This tool is designed for educational and research purposes. Always respect LinkedIn's Terms of Service and implement appropriate rate limiting and ethical usage practices.
