Awesome Lung CT Datasets
Medical AI · Open Source · Research Tools · Featured · September 2025

A curated, open-source collection of 15+ publicly available lung CT datasets with segmentation annotations — designed to help researchers quickly find the right data for medical image analysis.

Medical Imaging · CT Scans · Open Source · GitHub

Why This Project?

If you've ever worked on a medical imaging research project, you know the pain: finding the right public dataset is harder than it should be. Datasets are scattered across TCIA, Zenodo, Grand Challenge, Mendeley, and random university pages. Licenses vary, formats differ, and annotations come in all shapes.

I built Awesome Lung CT Datasets to solve this problem — a single, well-organized reference for every publicly available lung CT dataset with segmentation annotations. Whether you're working on nodule detection, COVID-19 lesion segmentation, airway extraction, or multi-organ segmentation, this repository gets you to the data in seconds.

What's Inside

The repository currently indexes 15+ datasets covering 100,000+ CT scans across five categories.

Scan Count Distribution

The following chart shows the number of scans available in each dataset. Note the logarithmic scale — NLST alone contains over 75,000 scans, while specialized datasets like AeroPath focus on quality with 27 carefully annotated volumes.

[Chart: scan count per dataset, logarithmic scale]
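To see why the chart needs a logarithmic axis, here is a minimal Python sketch that compares how tall each bar would be on a linear axis versus a log axis. The counts are copied from the tables in this post. The helper names `linear_fraction` and `log_fraction` are illustrative only and are not part of the repository.

```python
import math

# Scan counts copied from the tables in this post. NLST is listed as "75,000+",
# so 75_000 is used here as a lower bound.
SCAN_COUNTS = {
    "NLST": 75_000,
    "TotalSegmentator": 1_228,
    "LIDC-IDRI": 1_018,
    "ATM'22": 500,
    "AeroPath'23": 27,
}


def linear_fraction(count, largest):
    """Bar height relative to the largest bar on a linear axis."""
    return count / largest


def log_fraction(count, largest):
    """Bar height relative to the largest bar on a log10 axis."""
    return math.log10(count) / math.log10(largest)


largest = max(SCAN_COUNTS.values())
for name, count in sorted(SCAN_COUNTS.items(), key=lambda item: item[1], reverse=True):
    print(
        f"{name:<18}{count:>8,}"
        f"  linear={linear_fraction(count, largest):7.2%}"
        f"  log={log_fraction(count, largest):7.2%}"
    )
```

On a linear axis the AeroPath'23 bar would be well under 0.1% of the NLST bar and effectively invisible. On a log axis it is a bit under 30%, so the small datasets stay readable.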

General Lung Segmentation

| Dataset | Scans | Annotations | Format |
|---|---|---|---|
| LIDC-IDRI | 1,018 | Nodule annotations by up to 4 radiologists | DICOM + XML |
| LUNA16 | 888 | Lung nodules ≥3 mm | MetaImage (.mhd/.raw) |
| NSCLC Radiogenomics | 211 | Tumor segmentations + genomic data | DICOM |
| Medical Decathlon Lung | 96 | Lung tumor segmentations | NIfTI |

COVID-19 Lung CT

| Dataset | Scans | Annotations | Format |
|---|---|---|---|
| COVID-19 CT Seg Dataset 1 | 100 slices | Ground-glass, consolidation, effusion | NIfTI |
| COVID-19 CT Seg Dataset 2 | 829 slices | COVID-19 lesions | NIfTI |
| COVID-19 CT Lung & Infection | 20 volumes | Left/right lung + infections | NIfTI |

Thoracic Organ Segmentation

| Dataset | Scans | Annotations | Format |
|---|---|---|---|
| SegTHOR | 60 | Heart, trachea, aorta, esophagus | NIfTI |
| TotalSegmentator | 1,228 | 117+ anatomical structures | NIfTI |
| LCTSC 2017 | 60 | Lung, heart, spinal cord, esophagus | DICOM |

Airway Segmentation

| Dataset | Scans | Annotations | Format |
|---|---|---|---|
| ATM'22 | 500 | Full airway tree + centerlines | NIfTI |
| AeroPath'23 | 27 | Trachea + bronchi (challenging pathology) | NIfTI |
| AIIB23 | 285 | ILD masks + airway-informed biomarkers | NIfTI |

Specialized Datasets

| Dataset | Scans | Annotations | Format |
|---|---|---|---|
| LUNG-PET-CT-DX | 355 | Tumor bounding boxes | DICOM + XML |
| NLST | 75,000+ | Low-dose screening scans | DICOM |
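If you want to query the index from code rather than by eye, the tables can be treated as plain records. The following is a minimal sketch under that assumption: the `Dataset` class, the `CATALOG` list, and the `find` helper are hypothetical names for illustration, not something the repository ships, and only a handful of rows from the tables are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    category: str
    scans: int   # scan count as listed in the tables
    fmt: str     # file format as listed in the tables


# A few rows copied from the tables in this post (not the full index).
CATALOG = [
    Dataset("LIDC-IDRI", "General Lung Segmentation", 1018, "DICOM + XML"),
    Dataset("Medical Decathlon Lung", "General Lung Segmentation", 96, "NIfTI"),
    Dataset("SegTHOR", "Thoracic Organ Segmentation", 60, "NIfTI"),
    Dataset("TotalSegmentator", "Thoracic Organ Segmentation", 1228, "NIfTI"),
    Dataset("LCTSC 2017", "Thoracic Organ Segmentation", 60, "DICOM"),
    Dataset("ATM'22", "Airway Segmentation", 500, "NIfTI"),
    Dataset("AeroPath'23", "Airway Segmentation", 27, "NIfTI"),
    Dataset("NLST", "Specialized Datasets", 75_000, "DICOM"),
]


def find(catalog, category=None, fmt=None, min_scans=0):
    """Return the datasets that match every filter that was supplied."""
    return [
        d for d in catalog
        if (category is None or d.category == category)
        and (fmt is None or fmt in d.fmt)   # substring match, so "DICOM" also hits "DICOM + XML"
        and d.scans >= min_scans
    ]


# Example queries: all airway datasets, then NIfTI datasets with at least 500 scans.
print([d.name for d in find(CATALOG, category="Airway Segmentation")])
print([d.name for d in find(CATALOG, fmt="NIfTI", min_scans=500)])
```

The substring match on `fmt` is a deliberate simplification so that a search for DICOM also returns datasets that pair DICOM images with XML annotations.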

Annotation Coverage

The datasets span multiple annotation types essential for different research tasks. Pixel-level segmentation dominates, but the collection also covers bounding boxes, multi-organ labels, airway trees with centerlines, and multi-radiologist consensus annotations.

[Chart: annotation types]

  • Pixel-level segmentation: LIDC-IDRI, COVID-19 datasets, TotalSegmentator
  • Bounding boxes: LUNG-PET-CT-DX
  • Multi-organ labels: SegTHOR, TotalSegmentator (117+ structures)
  • Multi-radiologist consensus: LIDC-IDRI (up to 4 expert annotations per nodule)
  • Airway trees with centerlines: ATM'22
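The list above is keyed by annotation type, but the reverse question ("what does this dataset offer?") comes up just as often. Here is a small sketch that inverts the mapping. The dictionary mirrors the bullets above, and the names `ANNOTATION_TYPES` and `invert` are made up for this example.

```python
from collections import defaultdict

# Annotation type -> datasets, copied from the list above.
ANNOTATION_TYPES = {
    "Pixel-level segmentation": ["LIDC-IDRI", "COVID-19 datasets", "TotalSegmentator"],
    "Bounding boxes": ["LUNG-PET-CT-DX"],
    "Multi-organ labels": ["SegTHOR", "TotalSegmentator"],
    "Multi-radiologist consensus": ["LIDC-IDRI"],
    "Airway trees with centerlines": ["ATM'22"],
}


def invert(index):
    """Turn {annotation type: [datasets]} into {dataset: [annotation types]}."""
    by_dataset = defaultdict(list)
    for annotation_type, datasets in index.items():
        for dataset in datasets:
            by_dataset[dataset].append(annotation_type)
    return dict(by_dataset)


BY_DATASET = invert(ANNOTATION_TYPES)
for dataset, types in BY_DATASET.items():
    print(f"{dataset}: {', '.join(types)}")
```

Datasets that appear under more than one bullet, such as LIDC-IDRI and TotalSegmentator, end up with several annotation types in the inverted view.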

Research Impact

Based on publication counts, the datasets in this collection have been used in thousands of research papers. LIDC-IDRI alone appears in over 1,000 publications, which makes it the most widely used dataset in this collection.

[Chart: publication impact]

  1. LIDC-IDRI — 1,000+ publications
  2. LUNA16 — 500+ publications
  3. Medical Decathlon — 200+ publications
  4. COVID-19 Segmentation — 150+ publications

Who Is This For?

  • PhD students who are starting a new lung imaging project and need data fast
  • Researchers benchmarking their segmentation models across multiple datasets
  • Engineers building medical AI products and looking for training data
  • Radiologists interested in open datasets for validation studies

Contributing

The repository is open source under the MIT license. If you know of a public lung CT dataset that's missing, you can submit a pull request to add it. Contributions welcome!


Check out the full repository on GitHub — and star it if it helps your research!