System Architecture
fossabot's mission is to manage your dependency updates in an intelligent and highly automated fashion.
Data Sources
To do this, it requires access to 3 fundamental data sources:
1. Your application (source code)
Fetched from supported version control systems like GitHub or GitLab.
| Product | Support Status |
|---|---|
| GitHub.com | Supported |
| GitHub Enterprise Cloud | Supported |
| GitHub Enterprise Cloud with Data Residency | Supported |
| GitHub Enterprise Server | Supported |
| GitLab.com | Supported |
| GitLab Dedicated | Supported |
| GitLab Self-Hosted | Supported |
2. Your third-party dependencies (source code)
Package manager databases are used to connect the packages declared in your app with their published source code. Publicly available dependency source code is fetched from GitHub, GitLab and Bitbucket. Private dependencies are fetched from configured code mirrors.
3. Hosted AI models
fossabot exclusively uses LLM and AI services from Anthropic, through an enterprise agreement with no training on your data and no data retention. Different models are used for different tasks; a full list is available upon request.
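As a rough illustration of the second data source, the sketch below maps a package's registry metadata to a public source repository. It is not fossabot's implementation: the function name `source_repo_url`, the `KNOWN_HOSTS` set, and the npm-style `repository` field are assumptions chosen for the example.

```python
from typing import Optional
from urllib.parse import urlparse

# Public hosts named in the text; the exact domains are an assumption.
KNOWN_HOSTS = {"github.com", "gitlab.com", "bitbucket.org"}


def source_repo_url(metadata: dict) -> Optional[str]:
    """Return a normalized https repository URL from package metadata, or None."""
    repo = metadata.get("repository")
    # The field may be a plain string or an object with a "url" key.
    url = repo.get("url") if isinstance(repo, dict) else repo
    if not url:
        return None
    # Strip the "git+" scheme prefix and a trailing ".git" suffix.
    if url.startswith("git+"):
        url = url[len("git+"):]
    if url.endswith(".git"):
        url = url[: -len(".git")]
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname not in KNOWN_HOSTS:
        # Hypothetical fallback point: a private dependency would be
        # resolved through a configured code mirror instead.
        return None
    return f"https://{parsed.hostname}{parsed.path}"
```

A caller would pass the metadata record for one declared package and treat a `None` result as "look in the configured mirrors".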
First Party Code
Your first-party code is cloned at the start of each operation. Each analysis worker handles only one customer's analysis at a time, and your cloned code is deleted immediately after the analysis completes.
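The clone, analyze, delete lifecycle described above can be sketched with a temporary directory. This is a minimal illustration under assumptions, not the actual worker code: `analyze_in_ephemeral_clone`, `fetch`, and `analyze` are hypothetical names, and `fetch` stands in for whatever performs the clone.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def analyze_in_ephemeral_clone(
    fetch: Callable[[Path], None],
    analyze: Callable[[Path], T],
) -> T:
    """Fetch code into a fresh temp directory, analyze it, and always delete it."""
    # TemporaryDirectory removes the directory on exit, even if analyze raises.
    with tempfile.TemporaryDirectory(prefix="analysis-") as workdir:
        checkout = Path(workdir) / "repo"
        checkout.mkdir()
        fetch(checkout)  # hypothetical: e.g. a git clone into `checkout`
        # Only the derived result leaves this block; the files do not.
        return analyze(checkout)
```

Tying deletion to the context manager, rather than to a separate cleanup call, means a failed analysis cannot leave a checkout behind.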
Data Flow Diagram
The large majority of fossabot workflows focus on collecting and analyzing open-source, third-party code. Most of the collection is done through HTTP APIs and git clone-style operations. AI web search is used to provide real-time hints about bugs and malware that might not be reflected in release notes and changelogs.
First-party code is analyzed locally with static analysis tools first; AI is then used to understand the relevant call sites.
When "acting on analysis", code snippets are sent to AI services in order to assess the impact of changes in the third-party code or to adapt your call sites to mitigate breaking changes.

Interaction between the 3 categories of subsystems
Context Building
fossabot's handling of your code is split into security boundaries.

Isolation between different analysis steps
Direct Access means a copy of your code exists on disk. Once the code is processed and filtered, subsequent steps only have Indirect Access: metadata and small snippets of code derived from that earlier direct access. This keeps the AI models focused on extremely specific context.
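To make the distinction concrete, here is a small sketch of a direct-access step that reads files on disk and hands on only snippets and metadata. The names `Snippet` and `extract_call_sites` are hypothetical, and the plain substring match is a stand-in for real static analysis, which this example does not attempt.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class Snippet:
    path: str  # path relative to the checkout root (metadata)
    line: int  # 1-based line number of the match
    text: str  # a few lines of surrounding code


def extract_call_sites(root: Path, symbol: str, context: int = 1) -> List[Snippet]:
    """Direct-access step: scan files on disk and keep only small snippets."""
    snippets = []
    for file in sorted(root.rglob("*.py")):
        lines = file.read_text(encoding="utf-8", errors="replace").splitlines()
        for i, line in enumerate(lines):
            if symbol in line:  # simplification: substring match, not parsing
                lo, hi = max(0, i - context), min(len(lines), i + context + 1)
                snippets.append(
                    Snippet(str(file.relative_to(root)), i + 1, "\n".join(lines[lo:hi]))
                )
    return snippets
```

Later steps would receive only the returned `Snippet` values, never the `root` path, which is what gives them indirect rather than direct access.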
Analysis steps that Collect execute AI web search and run agentic processing with the smallest set of context needed to accomplish their goals.
Code Generation is required to be fully agentic, but the generation is limited to specific call sites. This happens on a small subset of code changes through @fossabot fix.
Communication with third-party systems never executes AI. Its only job is to send or receive pre-constructed data, which prevents credential commingling.
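One way to picture the communication boundary is to split message construction from message delivery, so that credentials exist only where no model runs. The sketch below is illustrative only: `OutboundMessage`, `build_message`, `send`, and the injected `transport` callable are all hypothetical names, not fossabot APIs.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OutboundMessage:
    """Fully built upstream; the sender never edits or generates content."""
    endpoint: str
    body: str  # already-serialized payload


def build_message(endpoint: str, findings: dict) -> OutboundMessage:
    # Runs in an analysis step: no credentials are available here.
    return OutboundMessage(endpoint, json.dumps(findings, sort_keys=True))


def send(
    message: OutboundMessage,
    token: str,
    transport: Callable[[str, dict, str], int],
) -> int:
    # Runs in the communication step: holds the credential, calls no model.
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return transport(message.endpoint, headers, message.body)
```

Because the message is frozen and pre-serialized, the token only ever appears in the request headers and never in data that an AI step has seen or produced.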
