Abstract
This technical report details the architecture and implementation of the BSIC Quant Library, a Python-based infrastructure that centralizes and optimizes financial data retrieval for BSIC members. It gives the association's members access to financial data through a robust pipeline for data crawling, storage, and retrieval.
The library is built on DuckDB for fast querying, AWS S3 for scalable cloud storage, and Apache Parquet for efficient columnar storage. Key architectural features include secure authentication via AWS SSO, strict isolation between the Development and Production environments, and a flexible crawling template.
By separating storage (S3) from compute (DuckDB), the library lets members query large financial datasets efficiently without storing them locally, while giving developers a structured CI/CD workflow for data ingestion. This infrastructure serves as the backbone for future quantitative research at BSIC and will be extended to cover additional types of financial assets.
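To make the storage/compute split concrete, the following is a minimal sketch of how a member-side query over Parquet files in S3 might look with DuckDB. It is not the library's actual API: the bucket layout (`s3://<bucket>/<env>/<dataset>/*.parquet`), the environment names `dev` and `prod`, the `date` column, and the helpers `parquet_uri`, `build_query`, and `fetch` are all assumptions made for illustration. It also assumes that credentials obtained through AWS SSO are discoverable by DuckDB's credential chain.

```python
from datetime import date


def parquet_uri(bucket: str, env: str, dataset: str) -> str:
    """Build a glob over the Parquet files of one dataset.

    The <bucket>/<env>/<dataset> layout is hypothetical.
    """
    if env not in ("dev", "prod"):  # assumed environment names
        raise ValueError(f"unknown environment: {env!r}")
    return f"s3://{bucket}/{env}/{dataset}/*.parquet"


def build_query(uri: str, columns: list[str]) -> str:
    """Select only the requested columns; the date bound is a bind parameter."""
    cols = ", ".join(f'"{c}"' for c in columns)
    # "date" is an assumed column name in the stored files.
    return f"SELECT {cols} FROM read_parquet('{uri}') WHERE \"date\" >= ?"


def fetch(bucket: str, env: str, dataset: str, columns: list[str], start: date):
    """Run the query in a local in-memory DuckDB instance against S3."""
    import duckdb  # third-party; imported lazily so the helpers above stay stdlib-only

    con = duckdb.connect()  # compute happens locally, in memory
    for ext in ("httpfs", "aws"):  # S3 access and AWS credential discovery
        con.execute(f"INSTALL {ext}")
        con.execute(f"LOAD {ext}")
    # Assumption: credentials (e.g. from a prior SSO login) are found by the chain.
    con.execute("CREATE SECRET (TYPE S3, PROVIDER CREDENTIAL_CHAIN)")
    sql = build_query(parquet_uri(bucket, env, dataset), columns)
    return con.execute(sql, [start]).df()  # .df() requires pandas
```

Under these assumptions, a call such as `fetch("example-bucket", "prod", "equities", ["ticker", "close"], date(2024, 1, 1))` would return a DataFrame. Because Parquet is columnar, DuckDB only needs to read the selected columns from S3, which is what makes querying without a local copy practical.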