Document Types

Alloovium automatically classifies documents to optimise analysis. Learn what types are supported and how the processing pipeline works.

Document Types

Alloovium automatically classifies uploaded documents into categories to help with organisation and to optimise analysis. Common types include contracts, specifications, drawings, reports, and correspondence. You can also assign a custom type to any document.

| Type | Description | Best used for |
| --- | --- | --- |
| Contract | Legal agreements and subcontracts | Payment terms, obligations, liquidated damages |
| Specification | Technical and performance specs | Scope verification, compliance checking |
| Drawing | Engineering and architectural drawings | Clash detection, revision tracking |
| Report | Site reports, inspection reports | Issue tracking, defect management |
| Correspondence | Emails, RFIs, instructions | Notice tracking, variation history |
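
As an illustration only, the built-in types and the option of a custom type could be modelled as follows. This is a minimal sketch, not Alloovium's API: the names `DocumentType`, `DocumentLabel`, and `display_name` are hypothetical, and the rule that a custom type takes precedence over the automatic one is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DocumentType(Enum):
    """The built-in types listed in the table above."""
    CONTRACT = "Contract"
    SPECIFICATION = "Specification"
    DRAWING = "Drawing"
    REPORT = "Report"
    CORRESPONDENCE = "Correspondence"


@dataclass(frozen=True)
class DocumentLabel:
    """Hypothetical label: an automatic built-in type plus an optional custom type."""
    builtin: Optional[DocumentType] = None
    custom: Optional[str] = None

    def display_name(self) -> str:
        # Assumption: a user-assigned custom type overrides the automatic one.
        if self.custom:
            return self.custom
        if self.builtin is not None:
            return self.builtin.value
        return "Unclassified"


auto = DocumentLabel(builtin=DocumentType.CONTRACT)
manual = DocumentLabel(builtin=DocumentType.REPORT, custom="Method Statement")
print(auto.display_name(), "/", manual.display_name())
```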

Processing Pipeline

Understanding how Alloovium processes your documents helps you get the most out of the platform. Each document goes through the following stages:

  1. Text extraction: Alloovium reads the raw text from each page. For scanned PDFs and images, GPU-accelerated OCR is applied automatically.
  2. Layout analysis: The document structure is analysed: headings, tables, figures, and paragraph blocks are identified and tagged.
  3. Chunking: The document is split into semantically meaningful segments (typically 200–500 tokens each) that are small enough for precise retrieval.
  4. Embedding: Each chunk is converted into a vector embedding using a high-dimensional model trained on technical and legal language.
  5. Indexing: Embeddings are stored in a vector index for fast similarity search, enabling sub-second retrieval during AI queries.
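
The last three stages (chunking, embedding, indexing) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Alloovium's implementation: chunks are fixed-size windows of whitespace-separated words rather than semantic segments, the "embedding" is a plain word-count vector standing in for a trained model, and the index is a brute-force cosine-similarity scan. All function and class names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_tokens(text, max_tokens=300):
    """Split text into chunks of at most max_tokens whitespace-separated words."""
    tokens = text.split()
    return [" ".join(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), max_tokens)]


def embed(text):
    """Stand-in embedding: a sparse word-count vector (not a trained model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """Brute-force index: stores every chunk with its vector and scans on search."""

    def __init__(self):
        self._items = []  # list of (chunk_text, vector) pairs

    def add(self, chunk):
        self._items.append((chunk, embed(chunk)))

    def search(self, query, k=1):
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), chunk) for chunk, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]


index = VectorIndex()
index.add("Liquidated damages are payable at the agreed daily rate.")
index.add("The concrete mix shall achieve the specified strength.")
print(index.search("liquidated damages rate"))
```

A production system would replace the linear scan with an approximate nearest-neighbour index, which is what makes sub-second retrieval practical on large document sets.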

GPU Extraction

Alloovium uses GPU-accelerated document extraction for complex PDFs and engineering drawings. This means higher accuracy on scanned documents, tables with merged cells, and multi-column layouts.
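
One common way to decide which pages need OCR is to check whether a page already carries a usable text layer. The sketch below illustrates that idea only; how Alloovium actually routes pages is not documented here. The `Page` class, the `needs_ocr` and `plan_extraction` functions, and the 50-character threshold are all hypothetical.

```python
from dataclasses import dataclass

MIN_TEXT_CHARS = 50  # hypothetical threshold for "has a usable text layer"


@dataclass
class Page:
    number: int
    embedded_text: str      # text layer found in the file; empty for pure scans
    is_image: bool = False  # True for image uploads such as photos or scans


def needs_ocr(page, min_chars=MIN_TEXT_CHARS):
    """Route a page to OCR if it is an image or has almost no embedded text."""
    return page.is_image or len(page.embedded_text.strip()) < min_chars


def plan_extraction(pages):
    """Map each page number to the extraction route it would take."""
    return {page.number: ("ocr" if needs_ocr(page) else "text-layer")
            for page in pages}


pages = [
    Page(1, "Clause 1. The Contractor shall complete the Works by the Date for Completion."),
    Page(2, ""),                                        # scanned page, no text layer
    Page(3, "Revision C drawing title block " * 3, is_image=True),
]
print(plan_extraction(pages))
```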