How to Use OpenAI API to Analyze Defense Contractor Earnings Reports

Transform dense quarterly earnings reports into actionable insights by building an automated analysis system using OpenAI's GPT-4 API.

Defense contractors just reported Q3 earnings, and if you're still reading 80-page 10-Q filings by hand, you're already three days behind the algorithms. While institutional investors deploy teams of analysts to extract contract values and margin trends, individual investors can now build their own automated analysis system using OpenAI's GPT-4 API, and do it in about an hour.

What You Will Build

Automated contract award extraction that identifies new programs worth $100M+ before they hit headlines
Margin trend analysis across aerospace vs. defense segments with quarter-over-quarter comparisons
Risk factor monitoring that flags cost overruns and program delays in real-time

Why This Approach Works When Manual Analysis Fails

Here's what most coverage misses about defense contractor analysis: the signal isn't in the executive summary or press release. It's buried in Management's Discussion and Analysis sections, where companies are legally required to disclose contract modifications, program risks, and forward guidance that directly impact stock prices.

Traditional financial analysis tools can't parse these dense regulatory filings. Bloomberg terminals cost $24,000 annually and still require manual extraction. But GPT-4's latest models can process structured financial text with 85%+ accuracy on standardized metrics like contract backlog and operating margins — the same benchmarks professional analysts track.

The real advantage? Speed. While human analysts spend 4-6 hours per quarterly filing, this automated approach processes the same information in 15-20 minutes at a cost of roughly $3-5 per analysis.

What You'll Need to Get Started

OpenAI API account with at least $20 credit (GPT-4 costs ~$0.03 per 1K tokens)
Python 3.8+ with pip package manager installed
Required libraries: openai, PyPDF2, pandas, python-dotenv, xlsxwriter, requests
Text editor or IDE: VS Code, PyCharm, or Jupyter Notebook
Basic Python knowledge: variables, functions, and API calls

Time estimate: 45-60 minutes | Difficulty: Intermediate

Setting Up Your Analysis Pipeline

Secure API Access and Prevent Runaway Costs

Navigate to platform.openai.com and create an account. Click "API Keys" in the left sidebar, then "Create new secret key". Copy the key immediately — OpenAI shows it only once.

Before writing any code, set a hard spending limit. Go to "Settings" > "Billing" > "Usage limits" and configure a $25 monthly cap. Defense contractor 10-Q reports average 50-80 pages, and processing costs can escalate quickly if your script hits an infinite loop.
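To get a feel for how quickly costs add up before running anything, a rough estimator can help. The sketch below is illustrative only: the function name `estimate_cost` is hypothetical, the default prompt price is the ~$0.03 per 1K tokens quoted above, and the completion price is an assumption that should be checked against OpenAI's current pricing page.

```python
def estimate_cost(num_requests, prompt_tokens, completion_tokens,
                  prompt_price_per_1k=0.03, completion_price_per_1k=0.06):
    """Estimate the dollar cost of a batch of identical-sized API requests.

    Prices are per 1,000 tokens and are assumptions, not authoritative figures.
    """
    per_request = (
        (prompt_tokens / 1000) * prompt_price_per_1k
        + (completion_tokens / 1000) * completion_price_per_1k
    )
    return num_requests * per_request


# Example: 100 requests, each with ~700 prompt tokens and ~300 completion tokens
print(f"Estimated cost: ${estimate_cost(100, 700, 300):.2f}")
```

Running the estimate with your own chunk count before each batch makes it easier to pick a sensible spending cap.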

Store your API key securely in a .env file:

OPENAI_API_KEY=sk-your-actual-key-here
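
A small guard can catch a missing or placeholder key before any request is sent. This is a minimal sketch, assuming the environment has already been populated (for example by `load_dotenv()` from python-dotenv); the helper name `get_api_key` is hypothetical.

```python
import os

# The sample value from the .env snippet; it must be replaced with a real key
PLACEHOLDER_KEY = "sk-your-actual-key-here"


def get_api_key(env=None):
    """Return OPENAI_API_KEY from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = (env.get("OPENAI_API_KEY") or "").strip()
    if not key or key == PLACEHOLDER_KEY:
        raise RuntimeError(
            "OPENAI_API_KEY is missing or still the placeholder; "
            "set it in .env and load the file before calling the API"
        )
    return key
```

It is also worth adding .env to your .gitignore so the key never ends up in version control.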

Download and Prepare Target Filings

Visit investor relations pages of major defense contractors. For Lockheed Martin, navigate to their financial reporting section. Download the most recent Form 10-Q as PDF.

Focus on these high-signal sections: "Management's Discussion and Analysis", "Segment Results", and "Contract Backlog". These contain the contract modifications, margin data, and forward-looking statements that move stock prices after earnings calls.
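
One way to locate these sections in extracted text is a simple heading search. The sketch below assumes you already have a list of page strings; the names `HIGH_SIGNAL_HEADINGS`, `normalize`, and `find_section_pages` are hypothetical, and real filings may word their headings differently.

```python
HIGH_SIGNAL_HEADINGS = (
    "management's discussion and analysis",
    "segment results",
    "contract backlog",
)


def normalize(text):
    """Lowercase, straighten curly apostrophes, and collapse whitespace."""
    return " ".join(text.lower().replace("\u2019", "'").split())


def find_section_pages(pages):
    """Map each heading to the indexes of pages whose text mentions it."""
    hits = {heading: [] for heading in HIGH_SIGNAL_HEADINGS}
    for index, page_text in enumerate(pages):
        normalized = normalize(page_text or "")
        for heading in HIGH_SIGNAL_HEADINGS:
            if heading in normalized:
                hits[heading].append(index)
    return hits
```

A plain substring match also hits the table of contents and cross-references, so treat the result as a starting point for choosing pages rather than a precise section boundary.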

Build the PDF Processing Foundation

Install required libraries and create your text extraction engine:

pip install openai PyPDF2 pandas python-dotenv xlsxwriter requests

The key challenge with 10-Q filings is their length. GPT-4 has token limits, so you need intelligent chunking that preserves context across page breaks:


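import os

import PyPDF2
from dotenv import load_dotenv

CHUNK_SIZE = 2000  # characters per chunk
CHUNK_STEP = 1800  # start of each new chunk, leaving a 200-character overlap

def extract_pdf_text(pdf_path):
    """Return overlapping text chunks that cover the whole PDF."""
    with open(pdf_path, 'rb') as file:
        pdf_reader = PyPDF2.PdfReader(file)
        # extract_text() can come back empty for image-only pages
        pages = [page.extract_text() or "" for page in pdf_reader.pages]

    # Join pages first so the overlap also spans page breaks
    full_text = "\n".join(pages)
    text_chunks = []
    for start in range(0, len(full_text), CHUNK_STEP):
        chunk = full_text[start:start + CHUNK_SIZE]
        if chunk.strip():  # skip whitespace-only chunks
            text_chunks.append(chunk)

    return text_chunks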

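The 200-character overlap (2,000-character chunks that start every 1,800 characters) means a short contract reference that straddles a chunk boundary still appears whole in one of the two neighboring chunks. That matters for defense programs, which often reference multi-year awards.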
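
The effect of the overlap is easy to check on synthetic text. The following is a minimal sketch with a hypothetical helper `chunk_with_overlap` and a made-up marker string placed across the first 2,000-character boundary.

```python
def chunk_with_overlap(text, size=2000, step=1800):
    """Split text into windows of `size` characters starting every `step` characters."""
    if step <= 0 or step > size:
        raise ValueError("step must be between 1 and size")
    return [text[i:i + size] for i in range(0, len(text), step)]


# Made-up marker that starts at character 1990 and so crosses position 2000
marker = "CONTRACT-REF-001"
text = "x" * 1990 + marker + "y" * 1500
chunks = chunk_with_overlap(text)

# The first chunk cuts the marker in half; the second contains it whole
print(marker in chunks[0], marker in chunks[1])
```

The guarantee only holds for references shorter than the overlap (size minus step, here 200 characters); longer passages can still be split.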

Creating Defense-Specific Analysis Prompts

Generic financial analysis prompts fail on defense contractors because they miss industry-specific metrics. Wall Street analysts track different KPIs for aerospace companies: contract backlog duration, program milestone achievements, and government customer concentration.

Design your prompt to extract exactly what matters:


# Literal braces are doubled because the template is filled with str.format()
DEFENSE_ANALYSIS_PROMPT = """
You are a defense industry equity analyst. Extract key metrics from this 10-Q section:

1. CONTRACT VALUES: New awards, modifications, backlog changes with dollar amounts
2. MARGIN TRENDS: Operating margins by segment (aerospace vs. defense vs. technologies)
3. PROGRAM RISKS: Cost overruns, schedule delays, government budget concerns
4. REVENUE GUIDANCE: Forward-looking statements about upcoming quarters

Return ONLY valid JSON, no explanatory text, with these keys:
{{
    "contract_awards": [],
    "margin_data": {{}},
    "risk_factors": [],
    "revenue_outlook": ""
}}

Text: {chunk_text}
"""

This structured approach yields consistent output format across different report sections. The JSON structure makes aggregation and Excel export straightforward.
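
One detail worth checking whenever a prompt template contains a literal JSON skeleton and is filled with `str.format`: literal braces must be written as `{{` and `}}`, otherwise `format` treats them as placeholders and raises an error. A small sketch with a hypothetical, shortened template `MINI_PROMPT`:

```python
MINI_PROMPT = """Return JSON with these keys:
{{
    "contract_awards": [],
    "risk_factors": []
}}

Text: {chunk_text}
"""


def build_prompt(chunk_text):
    """Fill the template; doubled braces come out as single literal braces."""
    return MINI_PROMPT.format(chunk_text=chunk_text)


print(build_prompt("Backlog rose during the quarter {see note 4}"))
```

Braces inside the substituted text are left alone, because `format` only parses the template itself.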

Processing Text Through GPT-4

Build the core analysis function with proper error handling and rate limiting:


import json
import os
import time

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
# openai>=1.0 uses a client object; openai.ChatCompletion was removed in 1.0
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

def analyze_chunk(chunk_text):
    try:
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{
                "role": "user",
                "content": DEFENSE_ANALYSIS_PROMPT.format(chunk_text=chunk_text)
            }],
            temperature=0.2,  # Low temperature for consistent financial analysis
            max_tokens=1000
        )

        time.sleep(20)  # Throttle to about 3 requests per minute
        return response.choices[0].message.content

    except Exception as e:
        print(f"API Error: {e}")
        return None

The temperature=0.2 setting is crucial. You want consistent financial analysis, not creative interpretation. The 20-second delay caps the script at roughly three requests per minute. Rate limits vary by account, so adjust the delay to match yours; for this kind of analysis, accuracy matters more than speed.
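
A fixed delay can be complemented with retries that wait longer after each failure. The sketch below uses exponential backoff, which is a swapped-in technique rather than something the pipeline above does; the helper name `call_with_backoff` and its defaults are hypothetical, and in real use you would likely narrow the `except` clause to the SDK's rate-limit exception.

```python
import time


def call_with_backoff(func, max_attempts=4, base_delay=20.0, sleep=time.sleep):
    """Call func(); after each failure wait base_delay, then 2x, 4x, and retry.

    The last failure is re-raised so the caller can see the original error.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Usage might look like `call_with_backoff(lambda: analyze_chunk(chunk))`, provided the wrapped function raises on failure instead of returning None.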

Aggregating Results Into Actionable Intelligence

Raw GPT-4 outputs need intelligent aggregation to become useful investment insights:


def aggregate_analysis_results(analysis_results):
    combined = {
        "contract_awards": [],
        "margin_data": {},
        "risk_factors": [],
        "revenue_outlook": []
    }
    
    for result in analysis_results:
        try:
            parsed = json.loads(result)
            combined["contract_awards"].extend(parsed.get("contract_awards", []))
            combined["risk_factors"].extend(parsed.get("risk_factors", []))
            
            if parsed.get("revenue_outlook"):
                combined["revenue_outlook"].append(parsed["revenue_outlook"])
            
            # Merge segment margin data
            for segment, margin in parsed.get("margin_data", {}).items():
                combined["margin_data"][segment] = margin
                
        except json.JSONDecodeError:
            continue
    
    return combined

This aggregation step transforms 80 pages of regulatory text into structured data that directly maps to the metrics institutional investors track: contract pipeline strength, margin trajectory, and program execution risk.
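
Note that the merge loop above keeps only the last margin value seen for each segment, so a later chunk silently overwrites an earlier one. If you would rather review every reported value, one option is to collect them all. This is a sketch with a hypothetical helper `collect_margin_observations`; it assumes each result is the raw JSON string returned for a chunk.

```python
import json
from collections import defaultdict


def collect_margin_observations(analysis_results):
    """Group every margin value reported for a segment, tagged with its chunk index."""
    observations = defaultdict(list)
    for chunk_index, result in enumerate(analysis_results):
        try:
            parsed = json.loads(result)
        except (json.JSONDecodeError, TypeError):
            continue  # skip malformed or missing responses
        margin_data = parsed.get("margin_data") if isinstance(parsed, dict) else None
        if not isinstance(margin_data, dict):
            continue
        for segment, margin in margin_data.items():
            observations[segment].append((chunk_index, margin))
    return dict(observations)
```

Conflicting values for the same segment are a useful signal that the model read two different tables, or misread one of them.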

Export Professional-Grade Analysis

Create Excel outputs that investment teams can use immediately:


import pandas as pd
from datetime import datetime

def export_to_excel(analysis_data, company_name, quarter):
    filename = f"defense_analysis_{company_name}_{quarter}_{datetime.now().strftime('%Y%m%d')}.xlsx"
    
    with pd.ExcelWriter(filename, engine='xlsxwriter') as writer:
        # Contract Awards Sheet
        if analysis_data["contract_awards"]:
            awards_df = pd.DataFrame(analysis_data["contract_awards"])
            awards_df.to_excel(writer, sheet_name='Contract_Awards', index=False)
        
        # Risk Assessment Sheet
        if analysis_data["risk_factors"]:
            risks_df = pd.DataFrame({"Risk_Factor": analysis_data["risk_factors"]})
            risks_df.to_excel(writer, sheet_name='Risk_Assessment', index=False)
        
        # Forward Guidance Sheet
        outlook_df = pd.DataFrame({"Guidance": analysis_data["revenue_outlook"]})
        outlook_df.to_excel(writer, sheet_name='Forward_Guidance', index=False)
    
    print(f"Analysis exported to {filename}")

The multi-sheet format matches professional equity research workflows. Analysts can quickly pivot between contract details, risk factors, and management guidance without hunting through regulatory text.

Running Your Complete Analysis

Combine all components into the master pipeline:


def analyze_defense_contractor_report(pdf_path, company_name, quarter):
    print(f"Processing {company_name} {quarter} 10-Q...")
    
    text_chunks = extract_pdf_text(pdf_path)
    print(f"Extracted {len(text_chunks)} chunks")
    
    analysis_results = []
    for i, chunk in enumerate(text_chunks):
        print(f"Analyzing chunk {i+1}/{len(text_chunks)}")
        result = analyze_chunk(chunk)
        if result:
            analysis_results.append(result)
    
    final_analysis = aggregate_analysis_results(analysis_results)
    export_to_excel(final_analysis, company_name, quarter)
    
    return final_analysis

# Execute analysis
if __name__ == "__main__":
    result = analyze_defense_contractor_report(
        "reports/lockheed_martin_10q_q3_2026.pdf", 
        "Lockheed_Martin", 
        "Q3_2026"
    )

When Things Go Wrong

Rate limit errors (429 status): Increase sleep delays to 30 seconds. OpenAI's GPT-4 has strict limits, especially for new accounts.

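Garbled PDF text extraction: Some 10-Q filings contain scanned, image-based pages that PyPDF2 cannot read as text. OCR is one way around this: pip install pytesseract. Note that pytesseract is a wrapper, so the Tesseract OCR engine must also be installed on your system, and PDF pages have to be rendered to images before OCR.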

JSON parsing failures: GPT-4 occasionally returns malformed responses. Add validation and retry logic with explicit instructions: "Return ONLY valid JSON, no explanatory text."
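
The validation step could be sketched as follows. The helper name `parse_model_json` is hypothetical; the required keys are the ones requested in the analysis prompt.

```python
import json

REQUIRED_KEYS = ("contract_awards", "margin_data", "risk_factors", "revenue_outlook")


def parse_model_json(raw):
    """Return the parsed dict, or None if the reply is not usable JSON."""
    if not raw:
        return None
    text = raw.strip()
    # Keep only the outermost {...} span, dropping code fences or commentary
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict) or not all(k in parsed for k in REQUIRED_KEYS):
        return None
    return parsed
```

When this returns None, the caller can resend the chunk with the stricter instruction, and give up on that chunk after a small, fixed number of attempts.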

The deeper issue here isn't technical — it's that 10-Q filings aren't designed for automated analysis. Companies use inconsistent formatting, and the SEC doesn't require structured data tags for contract information. Your system needs to be robust to these variations.

Scaling Beyond Single Company Analysis

Individual company analysis is just the beginning. The real value emerges when you process multiple contractors simultaneously and identify sector-wide trends that individual company analysis misses.

Set up automated monitoring that triggers analysis on earnings release dates — defense contractors typically report within 2-3 days of each other. Use GPT-3.5-turbo for initial filtering, then GPT-4 only for sections containing financial data. This approach cuts costs by 90% while maintaining analytical accuracy on the metrics that matter.

Build comparative dashboards that track contract backlog changes exceeding 10% quarter-over-quarter across the sector. As our previous analysis of AI defense contract economics revealed, Pentagon spending patterns create predictable opportunities for investors who can process information faster than traditional research methods.
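
The 10% backlog screen reduces to a small calculation. The sketch below assumes you have already collected the previous and current quarter's backlog for each company; the function name `flag_backlog_changes` is hypothetical and the figures in the example are placeholders, not real data.

```python
def flag_backlog_changes(backlog_by_company, threshold=0.10):
    """Return {company: fractional change} where |QoQ change| exceeds threshold.

    backlog_by_company maps a name to a (previous_quarter, current_quarter) pair.
    """
    flagged = {}
    for company, (previous, current) in backlog_by_company.items():
        if not previous:
            continue  # cannot compute a change from a zero or missing base
        change = (current - previous) / previous
        if abs(change) > threshold:
            flagged[company] = round(change, 4)
    return flagged


# Placeholder figures in arbitrary units, for illustration only
sample = {
    "Contractor A": (100.0, 112.0),
    "Contractor B": (200.0, 205.0),
    "Contractor C": (50.0, 44.0),
}
print(flag_backlog_changes(sample))
```

Using the absolute change flags sharp declines as well as increases, which is usually what a risk screen needs.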

The question isn't whether AI will transform financial analysis — it's whether you'll be processing next quarter's earnings reports while your competition is still reading this quarter's.