The Scheduler Loop: Job Queues and PDF Downloads

So I've got 466 FIIs (Brazilian real estate funds) seeded in a local test database.

ResumoFII's purpose is to summarize each fund's monthly report quickly. The reports are published on B3's site and can be downloaded automatically, so the idea is to automate both retrieval and summarization and eliminate the need to manually fetch and read each one every month.

The pipeline I built this week follows the Clean Architecture principles I explained in Building ResumoFII. We need to pull a PDF from B3, run it through Gemini, and store the resulting JSON summary.

The Core Loop

Job Scheduler

I set up a scheduler that checks for pending summarization jobs every 30 seconds. It uses a semaphore to limit concurrency to two jobs at a time. When a permit is available, a pending job is picked up and a new task is spawned to execute it. Both the polling interval and the concurrency limit are configurable.
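As a rough illustration of that loop (not the actual ResumoFII code; this post doesn't show the implementation, so Python's asyncio is used here purely as a stand-in), a semaphore-bounded poller might look like the sketch below. The names `run_scheduler`, `fetch_pending`, `execute`, and `max_ticks` are hypothetical, and the sketch assumes `fetch_pending` never returns a job that is already running.

```python
import asyncio


async def run_scheduler(fetch_pending, execute, interval=30.0,
                        max_concurrent=2, max_ticks=None):
    """Poll for pending jobs every `interval` seconds and run at most
    `max_concurrent` of them at once. `max_ticks` (hypothetical) bounds
    the number of polls so the loop can be stopped in a demo."""
    semaphore = asyncio.Semaphore(max_concurrent)
    tasks = set()  # keep references so spawned tasks are not garbage collected

    async def run_with_permit(job):
        try:
            await execute(job)
        finally:
            semaphore.release()  # free the permit for the next job

    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        for job in await fetch_pending():
            await semaphore.acquire()  # wait here until a permit is free
            task = asyncio.create_task(run_with_permit(job))
            tasks.add(task)
            task.add_done_callback(tasks.discard)
        ticks += 1
        await asyncio.sleep(interval)

    if tasks:
        # Let any jobs that are still in flight finish before returning.
        await asyncio.gather(*tasks)
```

Acquiring the permit before spawning the task, and releasing it in a `finally` block, means a job that raises still hands its slot back to the queue.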

Each job starts by checking if a cached PDF for the report already exists. If not, it downloads the PDF from B3 so we can run it through Gemini.
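A cache-first lookup like this can be sketched in a few lines. This is a minimal example under assumptions: the cache is a plain directory keyed by a report identifier, and `get_report_pdf`, `cache_dir`, and `download` are hypothetical names rather than anything from the ResumoFII codebase.

```python
from pathlib import Path


def get_report_pdf(report_id, cache_dir, download):
    """Return the path to the cached PDF for `report_id`, calling
    `download(report_id) -> bytes` only when no cached copy exists."""
    cache_dir = Path(cache_dir)
    pdf_path = cache_dir / f"{report_id}.pdf"
    if pdf_path.exists():
        return pdf_path  # cache hit: no network call needed

    data = download(report_id)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted write never
    # leaves a truncated file that looks like a valid cache entry.
    tmp_path = pdf_path.with_name(pdf_path.name + ".part")
    tmp_path.write_bytes(data)
    tmp_path.replace(pdf_path)
    return pdf_path
```

Keeping the PDF on disk means a job that fails later in the pipeline can be retried without hitting B3 again.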

The task executor has a pdf_report_gateway that abstracts the logic to find and download the correct report from B3. The concrete type implements the full retrieval flow and downloads the PDF if available. Depending on success or failure, the job is marked completed or failed. The semaphore permit is then released, allowing the next job to run.
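To show the shape of that abstraction, here is a hedged sketch of a gateway interface and an executor that depends only on it. Only the name pdf_report_gateway comes from the description above; the `fetch_report` method, its parameters, the `JobStatus` enum, and the dict-based job are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional, Protocol


class JobStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    FAILED = "failed"


class PdfReportGateway(Protocol):
    """Port the executor depends on; a concrete adapter would talk to B3."""

    def fetch_report(self, fund_ticker: str, month: str) -> Optional[bytes]:
        """Return the PDF bytes, or None if no report is available."""
        ...


def execute_job(job: dict, gateway: PdfReportGateway) -> JobStatus:
    """Run one job through the gateway and record the outcome on the job."""
    try:
        pdf = gateway.fetch_report(job["fund_ticker"], job["month"])
        error = None if pdf else "report not available"
    except Exception as exc:  # any retrieval error marks the job as failed
        pdf, error = None, str(exc)

    if pdf:
        job["status"] = JobStatus.COMPLETED
    else:
        job["status"] = JobStatus.FAILED
        job["error"] = error
    return job["status"]
```

Because the executor only sees the interface, a test can pass in a fake gateway that returns canned bytes instead of calling B3.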

What's Next

So far I've got job creation, queueing, B3 querying, PDF I/O, and job status updates all working. I can now focus on sending the downloaded PDF to Gemini for summarization and validating the end-to-end flow.

Once that's in place, the next step will be to spin up a VPS and backfill several months of data for all funds to start collecting feedback from real users.
