| Field | Title | Link |
|---|---|---|
| Computer Science | Natural Language Processing (NLP) | https://rajpurkar.github.io/SQuAD-explorer/ |
| | Computer Vision | https://cocodataset.org/#download |
| | Algorithms & Data Structures | https://snap.stanford.edu/data/ |
| | Programming Languages / Code Analysis | https://github.com/github/CodeSearchNet |
| | Operating Systems | https://github.com/google/cluster-data |
| | Databases & Data Mining | https://datasets.imdbws.com/ |
| | Computer Architecture / Hardware | https://github.com/felixsteinke/cpu-spec-dataset |
| Information Technology | Cloud Computing | https://github.com/google/cluster-data |
| | Software Engineering | https://www.kaggle.com/datasets/syedmharis/software-engineering-interview-questions-dataset |
| | IT Service Management | https://www.kaggle.com/datasets/swapniljadhav96/itsm-dataset |
| | Cybersecurity | https://www.kaggle.com/datasets/teamincribo/cyber-security-attacks |
| | User Behavior / Web Analytics | https://archive.org/details/datasets |
| Electrical Engineering | Power Systems | https://ieee-dataport.org/documents/power-system-multi-source-events-dataset |
| | Renewable Energy (Solar/Wind) | https://www.nrel.gov/grid/solar-power-data.html |
| | Smart Grid | https://www.kaggle.com/datasets/ziya07/smart-grid-monitoring-dataset/data |
| | Electrical Machines | https://ieee-dataport.org/open-access/industrial-machines-dataset-electrical-load-disaggregation |
| | Control Systems | https://ieee-dataport.org/documents/dataset-bundle-building-automation-and-control-systems-security-analysis# |
| Electronics and Communication Engineering | Digital Signal Processing (DSP) | https://www.kaggle.com/datasets/emirhanai/advanced-signal-processing-dataset-from-ai-sensors |
| | Wireless Communication | https://catalog.data.gov/dataset/?tags=wireless-communications-and-networks |
| | 5G / Cellular Networks | https://www.kaggle.com/datasets/vinothkannaece/5g-network-data |
| | Antenna & RF Systems | https://www.kaggle.com/datasets/suraj520/rf-signal-data |
| | VLSI / IC Design | https://github.com/vlsi/calcite-test-dataset |
| Biomedical | PhysioNet (ECG, EEG, Vital Signs) | https://physionet.org/ |
| | MIMIC-IV Clinical Database | https://www.kaggle.com/datasets/montassarba/mimic-iv-clinical-database-demo-2-2 |
| | BraTS (Brain Tumor Segmentation) | https://www.med.upenn.edu/cbica/brats2020/data.html |
| | COVID-19 Radiography Database | https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database |
| Renewable Energy | NREL Solar Power Data | https://github.com/Charlie5DH/Solar-Power-Datasets-and-Resources |
| | NREL Wind Integration Datasets | https://www.nrel.gov/grid/wind-toolkit.html |
| | Global Energy Forecasting Competition (GEFCom) | https://www.kaggle.com/competitions/GEF2012-wind-forecasting |
| | Open Power System Data (Renewables) | https://data.open-power-system-data.org/ |
| Mechanical Engineering | NASA Turbofan Engine Degradation Simulation | https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/ |
| | IC Engine Vibration / Sound Datasets | https://github.com/Charlie5DH/PredictiveMaintenance-and-Vibration-Resources |
| | Structural Health Monitoring Sensor Data | https://www.kaggle.com/datasets/ziya07/building-structural-health-sensor-dataset |
| | Robotics / Control Benchmark Datasets | https://github.com/mint-lab/awesome-robotics-datasets |
| Autonomous Vehicle Engineering | KITTI Vision Benchmark Suite | http://www.cvlibs.net/datasets/kitti/ |
| | Waymo Open Dataset | https://waymo.com/open/ |
| | ApolloScape | http://apolloscape.auto/ |
| Civil Engineering | Building Energy Dataset | https://www.kaggle.com/c/ashrae-energy-prediction |
| | Pavia University Remote Sensing | https://www.kaggle.com/datasets/syamkakarla/pavia-university-hsi |
| | UCI Concrete Compressive Strength | https://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength |
| Chemical Engineering | Catalysis Reaction Data (QM9 Molecules) | https://quantum-machine.org/datasets/ |
| | Chemical Process Simulation (DREAM Challenge) | https://zenodo.org/records/3735364 |
| | Industrial Chemical Sensor Data | https://archive.ics.uci.edu/dataset/45/heart+disease |
| | Process Systems Engineering Datasets (PSE) | https://data.world/briannielsen/process-systems-engineering |
| Aerospace Engineering | NASA Airfoil Self-Noise Dataset | https://www.kaggle.com/datasets/fedesoriano/airfoil-selfnoise-dataset |
| | UCI Flight Delay Dataset | https://www.transtats.bts.gov/OT_Delay/ |
| | NASA Turbofan Engine Degradation (C-MAPSS) | https://github.com/kpeters/exploring-nasas-turbofan-dataset |
| | OpenAeroStruct (Aero-Struct Optimization) | https://github.com/mdolab/OpenAeroStruct |
| | ERA5 Atmospheric Reanalysis Data | https://cds.climate.copernicus.eu/datasets |
| Industrial Engineering | UCI Manufacturing Failure Detection | https://www.kaggle.com/datasets/ziya07/smart-manufacturing-iot-cloud-monitoring-dataset |
| | SECOM Semiconductor Manufacturing Data | https://archive.ics.uci.edu/ml/datasets/SECOM |
| | Tennessee Eastman Process Simulation | https://github.com/jonathanwvd/awesome-industrial-datasets/blob/master/markdown/tennessee_eastman_process_simulation_dataset.md |
| | Open Jobs/Workforce Data (BLS) | https://www.bls.gov/data/ |
| | Assembly Line Sensor Data | https://universe.roboflow.com/wd-rohcm/dataset-s7uii |
| Metallurgical Engineering | Materials Data Repository (NIST) | https://github.com/sedaoturak/data-resources-for-materials-science |
| | Materials Project (Crystallography & Properties) | https://materialsproject.org/ |
| | Open Quantum Materials Database (OQMD) | https://colab.research.google.com/github/Tony-Y/oqmd-v1.2-dataset-for-cgnn/blob/main/OQMD_v1_2_dataset_for_CGNN.ipynb |
| Materials Science Engineering | Materials Project Database | https://materialsproject.org/ |
| | NIST Thermo-Calc Datasets | https://www.nist.gov/programs/projects/thermo-calc-data |
| | Citrine Materials Data | https://citrine.io/media-post/data-highlight-materials-project-dataset/ |
| | JARVIS-DFT Database | https://jarvis.nist.gov/ |
| Mechatronics Engineering | UCI Human Activity Recognition Using Smartphones | https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones |
| | Robotic Grasping Dataset (Cornell) | https://www.kaggle.com/datasets/oneoneliu/cornell-grasp |
| | OpenAI Gym Robotics Environments | https://github.com/openai/robogym |
| | Mobile Robot Navigation (TurtleBot Logs) | https://zenodo.org/records/1188976 |
| | Inertial Measurement Unit (IMU) Motion Data | https://www.kaggle.com/datasets/ziya07/ai-powered-imu-motion-dataset |
| Automobile Engineering | KITTI Autonomous Driving Dataset | http://www.cvlibs.net/datasets/kitti/ |
| | nuScenes AV Dataset | https://www.kaggle.com/datasets/mitanshuchakrawarty/nuscenes |
| | Car Evaluation Dataset | https://archive.ics.uci.edu/ml/datasets/Car+Evaluation |
| | Vehicle Fuel Consumption (EPA) | https://www.fueleconomy.gov/feg/download.shtml |
| | Open Traffic Data (HERE) | https://developer.here.com/products/traffic |
| Control Systems Engineering | UCI PID Controller Benchmark Data | https://archive.ics.uci.edu/ml/datasets/Servo |
| | MATLAB/Simulink Control Test Cases (CORA) | https://in.mathworks.com/matlabcentral/fileexchange/68551-cora |
| | Benchmark Control System Models (DAE) | https://www.cds.caltech.edu/~murray/wiki/ |
| | Aircraft Control Simulation Logs | https://data.nas.nasa.gov/ |
| Instrumentation & Control Engineering | UCI Servo Control Dataset | https://archive.ics.uci.edu/ml/datasets/Servo |
| | PID Tuning Benchmark (MATLAB/Simulink logs) | https://github.com/contractor-core/cora-benchmarks |
| | Industrial Process Control Data (Tennessee Eastman) | https://github.com/jonathanwvd/awesome-industrial-datasets/blob/master/markdown/tennessee_eastman_process_simulation_dataset.md |
| Embedded Systems Engineering | WISDM Smartphone Data (Embedded Sensors) | https://www.kaggle.com/datasets/antonandreenko/industrial-control-system-ics-alarm-text-dataset |
| | OpenEmbedded Benchmark Dataset | https://github.com/openembedded/ |
| | IoT Traffic Dataset (UCI) | https://github.com/thieu1995/iot_dataset/blob/master/ReadMe.md |
| | Arduino Sensor Dataset (UCI) | https://archive.ics.uci.edu/dataset/506/human+activity+recognition+from+continuous+ambient+sensor+data |
| VLSI Design Engineering | ISCAS Circuits Benchmark (VLSI Testing) | https://www.kaggle.com/datasets/hemanthhari/vlsi-data |
| | OpenROAD VLSI Data (Layout/Synthesis) | https://github.com/The-OpenROAD-Project |
| | ISPD Contest Benchmark Suites | https://universe.roboflow.com/casproject/ispd |
| Microelectronics Engineering | Microelectronic Failure Analysis Data | https://www.kaggle.com/datasets/umerrtx/machine-failure-prediction-using-sensor-data |
| | SEM Image Dataset (Materials) | https://github.com/BAMresearch/automatic-sem-image-segmentation |
| Power Electronics Engineering | Power Electronics Converter Data (Simulation) | https://data.world/briannielsen/power-electronics |
| | PEC Dataset (Inverter/Converter Logs) | https://www.kaggle.com/datasets/rusuanjun/pec-dataset |
| | Electric Vehicle Powertrain Data | https://data.gov/transportation/ |
| | Grid-Connected Inverter Dataset | https://www.nrel.gov/grid/solar-power-data.html |
| Biotechnology Engineering | Genomic Data (NCBI SRA) | https://www.ncbi.nlm.nih.gov/sra |
| | TCGA Cancer Genomics Dataset | https://portal.gdc.cancer.gov/ |
| | Human Microbiome Project Data | https://github.com/awslabs/open-data-registry/tree/main/datasets |
| | Protein Data Bank (PDB) | https://catalog.data.gov/dataset/protein-data-bank-pdb |
| | KEGG Pathway Database | https://www.genome.jp/kegg/ |
| Pharmaceutical Engineering | PubChem BioAssay | https://archive.ics.uci.edu/dataset/209/pubchem+bioassay+data |
| | DrugBank (Drug Data) | https://www.kaggle.com/datasets/aryelbezerra/drugbank-approved-drugs-dataset |
| | ChEMBL (Bioactive Molecules) | https://github.com/awslabs/open-data-registry/blob/main/datasets/chembl.yaml |
| Genetic Engineering | NCBI Gene Expression Omnibus (GEO) | https://www.ncbi.nlm.nih.gov/geo/ |
| | 1000 Genomes Project | https://www.internationalgenome.org/data |
| | Drosophila RNA-Seq (modENCODE) | https://www.kaggle.com/datasets/tianjiechen/tcga-rna-datasets |
| Food Technology Engineering | Food Composition Database (USDA) | https://fdc.nal.usda.gov/ |
| | Food Quality & Safety Data (UCI) | https://www.kaggle.com/datasets/nikhitaganiger/food-safety |
| | Food Microbiology (Fermentation) | https://data.world/makefoodsafe/fermentation |
| Agricultural Engineering | UCI Crop Dataset (Plant Seedlings) | https://www.kaggle.com/competitions/plant-seedlings-classification |
| | FAOSTAT Agriculture Data | https://www.fao.org/faostat/en/ |
| | Weather & Yield (Global) | https://www.ecmwf.int/en/forecasts/datasets |
| | Irrigation & Soil Moisture Data (USDA) | https://catalog.data.gov/dataset/?tags=irrigation |
| Dairy Technology Engineering | Milk Composition Database (IDF) | https://github.com/saideepaknagaraj/Milk-spectra-analysis |
| | UCI Fermented Dairy Dataset | https://www.kaggle.com/datasets/suraj520/dairy-goods-sales-dataset |
| Power Systems Engineering | NREL Renewable Integration Data | https://registry.opendata.aws/nrel-pds-wtk/ |
| | European Transmission System Data | https://www.entsoe.eu/data/ |
| | UCI Electrical Grid Stability | https://archive.ics.uci.edu/dataset/471/electrical+grid+stability+simulated+data |
| Geological Engineering | USGS Earthquake Catalog | https://www.kaggle.com/datasets/rupindersinghrana/usgs-earthquakes-2024 |
| | OneGeology Global Geoscience Data | https://www.kaggle.com/competitions/geology-forecast-challenge-open |
| | Mineral Resources Data System (USGS) | https://github.com/DOI-USGS/dataretrieval-python |
| Geo-Environmental Engineering | Global Soil Data (ISRIC-World Soil Information) | https://data.nasa.gov/dataset/global-data-set-of-derived-soil-properties-0-5-degree-grid-isric-wise-3fec2 |
| | Air Quality Open Data (OpenAQ) | https://openaq.org/ |
| | Water Quality Portal (USGS + EPA) | https://www.waterqualitydata.us/ |
| | NASA Earthdata Environmental Datasets | https://www.kaggle.com/datasets/ivansher/nasa-nearest-earth-objects-1910-2024 |
| Nanotechnology Engineering | Nanomaterial Registry | https://github.com/NanoCommons/datasets |
| | Materials Project (Nano/Crystalline Data) | https://materialsproject.org/ |
| | PubChem Nanomaterials | https://pubchem.ncbi.nlm.nih.gov/ |
| Networking | CAIDA Internet Traffic Dataset | https://www.caida.org/data/passive/ |
| | MAWI Working Group Traffic Archive | https://mawi.wide.ad.jp/mawi/ |
| | Internet Traffic Archive (ITA) | https://www.kaggle.com/datasets/ravikumargattu/network-traffic-dataset/data |
| Cybersecurity | CSE-CIC-IDS 2018 | https://www.unb.ca/cic/datasets/ids-2018.html |
| | MITRE ATT&CK Evaluations Dataset | https://github.com/mitre-attack/attack-stix-data |
| | DARPA Intrusion Detection Dataset | https://www.ll.mit.edu/r-d/datasets |
| Network Security | UNSW-NB15 Dataset | https://research.unsw.edu.au/projects/unsw-nb15-dataset |
| | NSL-KDD Dataset | https://www.unb.ca/cic/datasets/nsl.html |
| | Bot-IoT Dataset | https://research.unsw.edu.au/projects/bot-iot-dataset |
| Wireless Sensor Network (WSN) | Intel Berkeley Research Lab Sensor Dataset | http://db.csail.mit.edu/labdata/labdata.html |
| | LUCE Environmental Sensor Dataset | https://www.kaggle.com/datasets/garystafford/environmental-sensor-data-132k |
| | UCI Sensorless Drive Diagnosis | https://archive.ics.uci.edu/datasets?search=Sensorless_drive_diagnosis |
| Wireless Communication | DeepSig RadioML Dataset | https://github.com/sofwerx/deepsig_datasets |
| | Wireless InSite Channel Dataset | https://www.qualcomm.com/developer/software/wireless-indoor-simulations-dataset |
| | NYU Wireless mmWave Channel Measurements | https://archive.ics.uci.edu/datasets?search=Sensorless_drive_diagnosis |
| Network Communication | Stanford SNAP Communication Networks | https://snap.stanford.edu/data/ |
| | Email Communication Network (Enron) | https://www.cs.cmu.edu/~enron/ |
| | EU Email Communication Dataset | https://snap.stanford.edu/data/email-Eu-core.html |
| Satellite Communication | NASA Space Communications Dataset | https://data.nasa.gov/ |
| | ESA Satellite Telemetry Data | https://www.kaggle.com/datasets/sammahoney/esa-anomaly-dataset |
| | SATCOM Channel Measurement Dataset | https://github.com/clarkzjw/LENS |
| Telecommunication | ITU Telecommunication Indicators | https://data360.worldbank.org/en/dataset/ITU_DH |
| | Ofcom Telecom Market Data | https://www.ofcom.org.uk/research-and-data |
| | Telecom Italia Network Traffic Dataset | https://doi.org/10.7910/DVN/3QBYB5 |
| | Broadband Measurement Data (FCC MBA) | https://www.fcc.gov/general/measuring-broadband-america |
| | CAIDA Internet Topology Data | https://www.caida.org/data/ |
| Edge Computing | IoT Edge Analytics Dataset (UCI) | https://www.kaggle.com/datasets/mohamedamineferrag/edgeiiotset-cyber-security-dataset-of-iot-iiot |
| | Edge AI Sensor Dataset (WISDM) | https://archive.ics.uci.edu/ml/datasets/WISDM+Smartphone+and+Smartwatch+Activity+and+Biometry+Dataset |
| | Azure IoT Edge Telemetry Data | https://www.kaggle.com/code/chaozhuang/iot-telemetry-sensor-data-analysis |
| Fog Computing | Smart City Fog Computing Dataset | https://github.com/IBM/smart-city-analytics |
| | IoT-Fog Resource Usage Dataset | https://www.kaggle.com/datasets/ziya07/multi-tier-iot-resource-allocation-dataset |
| | Vehicular Fog Computing Dataset | https://github.com/aniketmaurya/vehicular-fog-dataset |
| | Cloud Fog Task Scheduling Dataset | https://data.world/uci/fog-computing |
| | Smart Healthcare Fog Dataset | https://www.kaggle.com/datasets/acharyakamal/smart-healthcare-prediction-management-system |
| Optical Communication | Optical Fiber Channel Dataset | https://github.com/functions-lab/COSMOS-EDFA-Dataset |
| | Coherent Optical Communication Dataset | https://zenodo.org/records/4553836 |
| | Optical Signal Modulation Dataset | https://ieee-dataport.org/open-access/optical-communication-datasets |
| | Nonlinear Fiber Optics Dataset | https://catalog.data.gov/dataset/?tags=fiber-optic |
| | Optical Noise Measurement Dataset | https://zenodo.org/records/8392622 |
| Optical Network | GÉANT Network Topology Data | https://ieee-dataport.org/documents/traffic-datsets-abilene-geant-taxibj |
| | Optical Transport Network Dataset (OTN) | https://ieee-dataport.org/open-access/optical-network-datasets |
| | WDM Network Benchmark Dataset | https://www.kaggle.com/competitions/ofc-2026-ml-challenge |
| | Flex-Grid Optical Network Dataset | https://zenodo.org/records/3696817 |
| Cellular Network | CRAWDAD Cellular Network Traces | https://crawdad.org/ |
| | OpenCellID (Cell Tower Data) | https://www.opencellid.org/ |
| | MIT Reality Mining Dataset | http://realitycommons.media.mit.edu/realitymining.html |
| | Telecom Italia Mobile Dataset | https://dandelion.eu/datamine/open-big-data/ |
| | 5G Dataset (InterDigital) | https://www.kaggle.com/datasets/vinothkannaece/5g-network-data |
| Mobile Communication | UCI Human Mobility Dataset | https://archive.ics.uci.edu/dataset/240/human+activity+recognition+using+smartphones |
| | Mobile Phone Usage Dataset (D4D Orange) | https://www.kaggle.com/datasets/bhadramohit/smartphone-usage-and-behavioral-dataset |
| | MIT Smartphone Sensing Dataset | https://www.kaggle.com/datasets/prince7489/smartphone-usage-dataset |
| | Wireless Mobility Traces (CRAWDAD) | https://crawdad.org/ |
| | Cell Phone Activity Dataset (Telecom Italia) | https://www.kaggle.com/code/ijfezika/mobile-phone-activity-exploratory-analysis |
| Distributed Computing | Google Cluster Workload Traces | https://github.com/google/cluster-data |
| | Alibaba Cluster Trace Dataset | https://github.com/alibaba/clusterdata |
| | Grid Workload Archive | https://gwa.ewi.tudelft.nl/ |
| | HPC Job Scheduling Dataset | https://zenodo.org/records/3634616 |
| | Distributed Systems Benchmark (DeathStarBench) | https://github.com/delimitrou/DeathStarBench |
| Cloud Computing | Google Cloud Trace Dataset | https://github.com/google/cluster-data |
| | Azure Public Cloud Dataset | https://www.kaggle.com/datasets/rishi2123/oragnizations-expenses-2023-2024 |
| | AWS Open Data Registry | https://registry.opendata.aws/ |
| | OpenDC Cloud Workload Dataset | https://github.com/atlarge-research/opendc |
| | Bitbrains Cloud Workload Traces | https://www.kaggle.com/datasets/gauravdhamane/gwa-bitbrains |
| Computer Vision | COCO (Common Objects in Context) | https://cocodataset.org/ |
| | Pascal VOC | http://host.robots.ox.ac.uk/pascal/VOC/ |
| | Cityscapes | https://www.cityscapes-dataset.com/ |
| | Open Images Dataset | https://storage.googleapis.com/openimages/web/index.html |
| Pattern Recognition | MNIST Handwritten Digits | http://yann.lecun.com/exdb/mnist/ |
| | EMNIST Extended Digits & Letters | https://www.nist.gov/itl/products-and-services/emnist-dataset |
| | UCI Optical Recognition of Handwritten Digits | https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits |
| | ISOLET Spoken Letter Dataset | https://archive.ics.uci.edu/ml/datasets/isolet |
| | Caltech 101 | https://data.caltech.edu/records/20086 |
| Remote Sensing | Landsat Satellite Imagery | https://landsat.gsfc.nasa.gov/data/ |
| | Sentinel-2 Satellite Data | https://www.kaggle.com/datasets/salmaadell/eurosat-rgb |
| | UC Merced Land Use Dataset | https://www.kaggle.com/datasets/abdulhasibuddin/uc-merced-land-use-dataset |
| | ISPRS Aerial Image Dataset | https://github.com/whuwuteng/Aerial_Stereo_Dataset |
| | MODIS Earth Observation Data | https://modis.gsfc.nasa.gov/data/ |
| Natural Language Processing (NLP) | SQuAD (Question Answering) | https://rajpurkar.github.io/SQuAD-explorer/ |
| | WikiText Language Modeling Dataset | https://www.kaggle.com/datasets/rohitgr/wikitext |
| | Common Crawl Text Corpus | https://www.kaggle.com/datasets/jyesawtellrickson/commoncrawl |
| | IMDB Movie Reviews | https://ai.stanford.edu/~amaas/data/sentiment/ |
| Image Processing | Berkeley Segmentation Dataset (BSDS500) | https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/ |
| | USC SIPI Image Database | https://sipi.usc.edu/database/ |
| | Set12 Image Denoising Dataset | https://github.com/cszn/DnCNN |
| | DIV2K High-Resolution Images | https://data.vision.ee.ethz.ch/cvl/DIV2K/ |
| | Kodak Image Dataset | https://r0k.us/graphics/kodak/ |
| Signal Processing | MIT-BIH Arrhythmia ECG Dataset | https://physionet.org/content/mitdb/ |
| | Speech Commands Dataset | https://www.tensorflow.org/datasets/catalog/speech_commands |
| | RadioML Signal Modulation Dataset | https://www.kaggle.com/datasets/pinxau1000/radioml2018 |
| | UCI Gas Sensor Array Drift Dataset | https://archive.ics.uci.edu/dataset/224/gas+sensor+array+drift+dataset |
| | EEG Motor Movement Dataset | https://physionet.org/content/eegmmidb/ |
| Biomedical | PhysioNet Clinical Signals | https://physionet.org/ |
| | MIMIC-IV Clinical Database | https://www.kaggle.com/datasets/montassarba/mimic-iv-clinical-database-demo-2-2 |
| | BraTS Brain Tumor Dataset | https://www.med.upenn.edu/cbica/brats2020/data.html |
| | NIH Chest X-ray Dataset | https://nihcc.app.box.com/v/ChestXray-NIHCC |
| | ADNI Alzheimer’s Dataset | https://adni.loni.usc.edu/ |
| Big Data | Google Cluster Workload Traces | https://github.com/google/cluster-data |
| | Amazon Reviews Dataset | https://registry.opendata.aws/amazon-reviews/ |
| | NYC Taxi Trip Records | https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page |
| | Wikipedia Data Dumps | https://dumps.wikimedia.org/ |
| | Common Crawl Web Data | https://www.kaggle.com/datasets/jyesawtellrickson/commoncrawl |
| Software Engineering | PROMISE Software Defect Dataset | https://github.com/feiwww/PROMISE-backup |
| | CodeSearchNet | https://github.com/github/CodeSearchNet |
| | Apache Software Logs | https://www.kaggle.com/datasets/omduggineni/loghub-apache-log-data |
| | GitHub Public Dataset (BigQuery) | https://cloud.google.com/bigquery/public-data/github |
| | NASA Software Defect Dataset | https://github.com/klainfo/NASADefectDataset |
| Power Electronics | Power Converter Dataset (Zenodo) | https://zenodo.org/records/3606180 |
| | PEMS Power Electronics Measurements | https://www.kaggle.com/datasets/sepandhaghighi/proton-exchange-membrane-pem-fuel-cell-dataset |
| | Inverter Fault Diagnosis Dataset | https://www.kaggle.com/datasets/ziya07/fault-diagnosis-dataset-for-new-energy-vehicles |
| | Electric Drive & Converter Dataset (UCI) | https://archive.ics.uci.edu/dataset/321/electricityloaddiagrams20112014 |
| Power Systems | IEEE Power System Test Cases | https://labs.ece.uw.edu/pstca/ |
| | MATPOWER Power Grid Data | https://matpower.org/download/ |
| | ENTSO-E Electricity Network Data | https://www.entsoe.eu/data/ |
| | NREL Power System Data | https://www.nrel.gov/grid/data-tools.html |
| | UCI Electrical Grid Stability Dataset | https://archive.ics.uci.edu/ml/datasets/Electrical+Grid+Stability+Simulated+Data |
| | Open Power System Data | https://data.open-power-system-data.org/ |
| Wind Turbine / Solar Energy | NREL Wind Toolkit | https://www.nrel.gov/grid/wind-toolkit.html |
| | NREL Solar Power Data | https://www.nrel.gov/grid/solar-power-data.html |
| | Wind Turbine SCADA Dataset | https://data.world/energi/wind-turbine-scada |
| | Global Solar Atlas Data | https://gee-community-catalog.org/projects/gsa/ |
| | GEFCom Renewable Forecasting Dataset | https://www.kaggle.com/competitions/GEF2012-wind-forecasting |
| | COCO Dataset | https://cocodataset.org/ |
| | Open Images Dataset | https://storage.googleapis.com/openimages/web/index.html |
| | AI2 ARC Reasoning Dataset | https://allenai.org/data/arc |
| | CLEVR Reasoning Dataset | https://cs.stanford.edu/people/jcjohns/clevr/ |
| Artificial Intelligence | UCI Machine Learning Repository | https://archive.ics.uci.edu/ml/index.php |
| | OpenML Benchmark Datasets | https://www.openml.org/ |
| | LIBSVM Dataset Collection | https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/ |
| | StatLib Datasets | https://lib.stat.cmu.edu/datasets/ |
| Deep Learning | MNIST | https://git-disl.github.io/GTDLBench/datasets/mnist_datasets/ |
| | CIFAR-10 / CIFAR-100 | https://www.cs.toronto.edu/~kriz/cifar.html |
| | SVHN Dataset | http://ufldl.stanford.edu/housenumbers/ |
| | CelebA Face Dataset | https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html |
| | ImageNet-21K | https://www.image-net.org/download |
| | DeepSig RadioML Dataset | https://www.deepsig.ai/datasets/ |
| AI LLM (Large Language Models) | The Pile (Massive Text Corpus) | https://pile.eleuther.ai/ |
| | C4 (Colossal Clean Crawled Corpus) | https://www.tensorflow.org/datasets/catalog/c4 |
| | WikiText-103 | https://huggingface.co/datasets/Salesforce/wikitext/tree/main/wikitext-103-raw-v1 |
| | OpenWebText | https://github.com/jcpeterson/openwebtext |
| | BigScience ROOTS Corpus | https://huggingface.co/bigscience-corpus |
| AI SLM | Alpaca Instruction Dataset | https://github.com/tatsu-lab/stanford_alpaca |
| | FLAN Instruction Dataset | https://github.com/google-research/FLAN |
| | Dolly Instruction Dataset | https://huggingface.co/datasets/databricks/databricks-dolly-15k |
| | TinyStories Dataset | https://huggingface.co/datasets/roneneldan/TinyStories |
| Artificial General Intelligence | ARC (Abstraction and Reasoning Corpus) | https://allenai.org/data/arc |
| | BabyAI Platform & Dataset | https://github.com/mila-iqia/babyai |
| | bAbI Reasoning Tasks | https://github.com/facebookarchive/bAbI-tasks |
| | CLEVR Compositional Reasoning Dataset | https://cs.stanford.edu/people/jcjohns/clevr/ |
| Neuro-Symbolic AI | CLEVRER (Causal & Symbolic Reasoning) | http://clevrer.csail.mit.edu/ |
| | DeepProbLog Datasets | https://github.com/ML-KULeuven/deepproblog |
| | Abduction and Argumentation Dataset | https://github.com/AbductiveLearning/ABLSim |
| | Logic Tensor Networks Benchmarks | https://github.com/logictensornetworks/ltntorch |
| Cognitive Computing | OpenCog Cognitive Datasets | https://github.com/opencog/opencog |
| | ATOMIC Commonsense Knowledge Graph | https://allenai.org/data/atomic |
| | ConceptNet | https://conceptnet.io/ |
| | MindBigData (Human Thought Data) | http://www.mindbigdata.com/opendb/ |
| Self-Supervised Learning | ImageNet (Unlabeled / SSL Use) | https://www.image-net.org/ |
| | STL-10 Dataset | https://cs.stanford.edu/~acoates/stl10/ |
| | AudioSet (Self-Supervised Audio) | https://research.google.com/audioset/ |
| | Kinetics Video Dataset | https://github.com/cvdfoundation/kinetics-dataset |
| | LibriSpeech (SSL for Speech) | https://www.openslr.org/12 |
| Federated Learning | LEAF Federated Learning Benchmark | https://leaf.cmu.edu/ |
| | FedScale Dataset Suite | https://github.com/SymbioticLab/FedScale |
| | Google Federated EMNIST | https://figshare.com/articles/dataset/Federated_EMNIST_Dataset/26308777 |
| | NIID-Bench Federated Dataset | https://github.com/Xtra-Computing/NIID-Bench |
| Explainable AI | UCI Adult Dataset (XAI Benchmark) | https://archive.ics.uci.edu/ml/datasets/adult |
| | COMPAS Recidivism Dataset | https://www.kaggle.com/datasets/danofer/compass |
| | OpenML Explainability Benchmarks | https://www.openml.org/search?type=data |
| | MIMIC-IV (Clinical XAI) | https://mimic.physionet.org/ |
| | FICO Explainable ML Challenge Dataset | https://www.kaggle.com/datasets/lhagiimn/fico-dataset |
| Quantum Machine Learning | QML Benchmark Datasets (IBM) | https://huggingface.co/datasets/Cohaerence/ibm-qml-kernel |
| | Quantum Data Sets (UCI-style) | https://quantum-machine.org/datasets/ |
| | QASM Circuit Dataset | https://github.com/FujiiLabCollaboration/MNISQ-quantum-circuit-dataset |
| | QML Toy Datasets (PennyLane) | https://pennylane.ai/datasets/collection/qml-benchmarks |
| Edge AI / TinyML | Google Speech Commands (TinyML) | https://www.tensorflow.org/datasets/catalog/speech_commands |
| | MLPerf Tiny Benchmark Dataset | https://github.com/mlcommons/tiny |
| | Edge Impulse Public Datasets | https://docs.edgeimpulse.com/datasets |
| | UCI HAR (Embedded Sensors) | https://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones |
| | WISDM Wearable Dataset | https://archive.ics.uci.edu/dataset/507/wisdm+smartphone+and+smartwatch+activity+and+biometrics+dataset |
| Generative AI | LAION-5B Multimodal Dataset | https://laion.ai/blog/laion-5b/ |
| | The Pile (Text Generation) | https://pile.eleuther.ai/ |
| | Common Crawl | https://commoncrawl.org/ |
| | CelebA (Image Generation) | https://mmlab.ie.cuhk.edu.hk/projects/CelebA.html |
| | MusicNet (Audio Generation) | https://homes.cs.washington.edu/~thickstn/musicnet.html |
| | CodeSearchNet (Code Generation) | https://github.com/github/CodeSearchNet |
| Neuromorphic Computing | Spiking Heidelberg Digits (SHD) | https://ieee-dataport.org/open-access/heidelberg-spiking-datasets |
| | Spiking Speech Commands | https://zenkelab.org/resources/spiking-heidelberg-datasets-shd/ |
| | DVS Gesture Dataset | https://github.com/VicenteAlex/DVS-Gesture-Chain?tab=readme-ov-file |
| | N-MNIST | https://www.garrickorchard.com/datasets/n-mnist |
| Data Science and Analytics | UCI Adult Income Dataset | https://archive.ics.uci.edu/ml/datasets/adult |
| | OpenML Benchmark Suite | https://www.openml.org/search?type=data |
| | World Bank Open Data | https://data.worldbank.org/ |
| | NYC Open Data | https://www.kaggle.com/datasets/nycopendata/new-york |
| Self-Supervised Learning | YCB Object and Model Set | https://www.ycbbenchmarks.com/ |
| | RoboNet Dataset | https://github.com/SudeepDasari/RoboNet |
| | DROID Robot Manipulation Dataset | https://droid-dataset.github.io/ |
| | Oxford RobotCar Dataset | https://robotcar-dataset.robots.ox.ac.uk/ |
| Signals and Systems | MIT-BIH Arrhythmia Database | https://physionet.org/content/mitdb/ |
| | ECG-ID Database | https://physionet.org/content/ecgiddb/ |
| | NOAA Signal Data | https://www.ngdc.noaa.gov/ |
| | UCR Time Series Archive | https://www.timeseriesclassification.com/ |
| | PhysioNet Signal Archive | https://physionet.org/about/database/ |
| Blockchain | Ethereum Blockchain Dataset | https://www.kaggle.com/datasets/bigquery/ethereum-blockchain |
| | Bitcoin Historical Data | https://www.kaggle.com/datasets/mczielinski/bitcoin-historical-data |
| | Elliptic Bitcoin Dataset | https://www.kaggle.com/datasets/ellipticco/elliptic-data-set |
| | Blockchain.com Charts Data | https://www.blockchain.com/charts |
| 5G Network | ITU IMT-2020 Evaluation Data | https://www.itu.int/md/meetingdoc.asp?lang=en&parent=R19-IMT.2020.SAT-C&source=WP4B |
| | Open5GCore Datasets | https://github.com/OPENAIRINTERFACE/openairinterface5g |
| | 5G NR Channel Model Data | https://zenodo.org/records/15210986 |
| VANET | VeReMi Dataset | https://veremi-dataset.github.io/ |
| | Luxembourg SUMO Traffic Dataset | https://github.com/lcodeca/LuSTScenario |
| | NGSIM Vehicle Trajectories | https://www.kaggle.com/datasets/nigelwilliams/ngsim-vehicle-trajectory-data-us-101 |
| | TAPAS Cologne Traffic Dataset | https://sumo.dlr.de/docs/Data/Scenarios/TAPASCologne.html |
| V2X Communication | OPV2V Autonomous Driving Dataset | https://mobility-lab.seas.ucla.edu/opv2v/ |
| | DAIR-V2X Dataset | https://github.com/AIR-THU/DAIR-V2X?tab=readme-ov-file#dataset |
| | V2X-Sim Dataset | https://github.com/ai4ce/V2X-Sim |
| OFDM Wireless Communication | COST 207 Channel Model Data | https://onlinelibrary.wiley.com/doi/epdf/10.1002/0470847808.app5 |
| | IEEE 802.11 OFDM Signal Dataset | https://ieee-dataport.org/open-access/deepwiphy-synthetic-and-real-world-ieee-80211ax-ofdm-symbol-dataset |
| | Wireless InSite Ray-Tracing Dataset | https://github.com/sowang46/mmWave_V2X_dataset |
| MANET (Mobile Ad Hoc Networks) | CRAWDAD MANET Traces | https://crawdad.org/ |
| | MIT MANET Mobility Dataset | https://tracebase.org/tracebase/ |
| | MANET Routing Dataset | https://www.kaggle.com/datasets/gymprathap/manet-routing-dataset |
| SDN (Software Defined Networking) | ARP SDN Traffic Dataset | https://github.com/nisha077/ARP-SDN-Dataset |
| | InSDN Dataset | https://www.kaggle.com/datasets/badcodebuilder/insdn-dataset |
| | SDN DDoS Dataset | https://ieee-dataport.org/documents/sdn-ddos-attack-image-dataset |
| | Mininet Traffic Dataset | https://data.mendeley.com/datasets/9hz6f62gtk/1 |
| Underwater Sensor Network | SFI Smart Ocean Acoustic Dataset | https://ieee-dataport.org/open-access/sfi-smart-ocean-dataset-underwater-acoustic-communications |
| | Underwater Acoustic Communication Dataset | https://ieee-dataport.org/documents/band-full-duplex-underwater-acoustic-communication-measurements-lake-environment |
| | UUV Simulator Data | https://catalog.data.gov/dataset/teamer-electrically-engaged-undulation-system-for-unmanned-underwater-vehicles-7ffb5 |
| | Sea Trial Acoustic Dataset | https://zenodo.org/records/6372728 |
| IoT (Internet of Things) | IoT-23 Dataset | https://www.stratosphereips.org/datasets-iot23 |
| | TON_IoT Dataset | https://research.unsw.edu.au/projects/toniot-datasets |
| | Intel Lab IoT Sensor Dataset | https://db.csail.mit.edu/labdata/labdata.html |
| | Smart Home IoT Dataset | https://www.kaggle.com/datasets/taranvee/smart-home-dataset-with-weather-information |
| Quantum Networking | Quantum Internet Dataset | https://zenodo.org/records/17504715 |
| | IBM Quantum Network Data | https://quantum-computing.ibm.com/services/resources |
| | Quantum Entanglement Network Dataset | https://zenodo.org/records/8279583 |
| | QKD Experimental Dataset | https://archive.researchdata.leeds.ac.uk/1285/ |
| 6G Networks | 6G Channel Measurement Dataset | https://github.com/ocatak/6g-channel-estimation-dataset |
| | Terahertz Communication Dataset | https://ieee-dataport.org/documents/measurement-based-parameterization-physics-reflection-models-terahertz-communication-s21 |
| | Hexa-X 6G Dataset | https://zenodo.org/records/17396743 |
| | AI-enabled 6G Network Dataset | https://www.kaggle.com/datasets/ziya07/dynamic-network-slicing-dataset-in-6g-networks |
| Network Routing | Rocketfuel ISP Topology Dataset | https://www.cs.washington.edu/research/networking/rocketfuel/ |
| CAIDA Internet Topology Data | https://www.caida.org/catalog/datasets/ |
| INET Routing Dataset | https://www.kaggle.com/datasets/asfandyar250/network |
| Open-Source Routing Traces | https://github.com/BNN-UPC/NetworkModelingDatasets |
| Intrusion Detection System | CIC-IDS2017 | https://www.unb.ca/cic/datasets/ids-2017.html |
| UNSW-NB15 | https://research.unsw.edu.au/projects/unsw-nb15-dataset |
| NSL-KDD | https://www.unb.ca/cic/datasets/nsl.html |
| TII-SSRC-23 | https://ieee-dataport.org/documents/tii-ssrc-23-dataset-edited |
| DARPA IDS Dataset | https://www.ll.mit.edu/r-d/datasets/1998-darpa-intrusion-detection-evaluation-dataset |
| MIMO (Multiple Input Multiple Output) | DeepMIMO Dataset | https://github.com/DeepMIMO/DeepMIMO |
| NYU Wireless mmWave MIMO Dataset | https://github.com/nyu-wireless/mmwRobotNav |
| Massive MIMO Channel Measurements | https://ieee-dataport.org/open-access/beamspace-channel-dataset-mmwave-massive-mimo |
| COST 2100 MIMO Dataset | https://www.kaggle.com/datasets/forment/cost2100 |
| Cognitive Radio Networks | CRAWDAD Spectrum Occupancy Measurements | https://crawdad.org/ |
| Spectrum Measurement Dataset | https://www.kaggle.com/datasets/ajithdari/cass-spectrum-dataset |
| Electrosense Radio Spectrum Dataset | https://zenodo.org/records/7521246 |
| IEEE 802.22 WRAN Simulation Data | https://www.ieee802.org/22/ |
| Digital Forensics | Digital Corpora Forensic Images | https://digitalcorpora.org/ |
| DFRWS Forensic Challenge Datasets | https://www.dfrws.org/forensic-challenges/ |
| NIST CFReDS Dataset | https://cfreds.nist.gov/ |
| UC Irvine Memory Forensics Dataset | https://daniyyell.com/datasets/Memory-Forensics-Attack-Simulation-Dataset/ |
| Wireless Body Area Network (WBAN) | MHEALTH Dataset | https://archive.ics.uci.edu/ml/datasets/mhealth+dataset |
| WISDM Wearable Sensor Dataset | https://www.cis.fordham.edu/wisdm/dataset.php |
| BSN Challenge Dataset | https://physionet.org/content/bhi-2018-challenge/1.0/ |
| LTE (Long Term Evolution) | OpenAirInterface LTE Dataset | https://data.europa.eu/data/datasets/oai-zenodo-org-10811147?locale=de |
| LTE Drive Test Dataset | https://ieee-dataport.org/open-access/technical-university-denmark-lte-drive-test-measurements |
| Vienna LTE-A Link Level Simulator Data | https://arxiv.org/html/2603.02638v1 |
| Ad Hoc Networks | MIT Reality Mining Dataset | http://realitycommons.media.mit.edu/realitymining.html |
| FAN-GHETS24 Ad Hoc Dataset | https://zenodo.org/records/13315419 |
| Helsinki Mobility Traces | https://www.tracebase.org/tracebase/ |
| Forensic Science | Digital Corpora Forensic Images | https://digitalcorpora.org/ |
| NIST CFReDS | https://cfreds.nist.gov/ |
| DFRWS Forensic Challenge Datasets | https://www.dfrws.org/forensic-challenges/ |
| GovDocs1 Forensic Corpus | https://digitalcorpora.org/corpora/govdocs |
| Psychology | Open Psychometrics Data | https://openpsychometrics.org/_rawdata/ |
| Human Connectome Project | https://www.humanconnectome.org/ |
| Child Mind Institute Dataset | https://www.kaggle.com/competitions/child-mind-institute-problematic-internet-use/data |
| MIDUS Psychological Study | https://midus.wisc.edu/data-access/ |
| APA Open Data Repository | https://www.apa.org/pubs/databases |
| Public Administration | IMF World Economic Outlook Databases | https://www.imf.org/en/publications/sprolls/world-economic-outlook-databases |
| OECD Public Governance Data | https://oecd-public-integrity-indicators.org/indicators/ |
| UN Public Administration Dataset | https://publicadministration.un.org/ |
| USA Government Open Data | https://www.data.gov/ |
| European Open Government Data | https://data.europa.eu/ |
| Economics | Penn World Table | https://www.rug.nl/ggdc/productivity/pwt/ |
| IMF World Economic Outlook Data | https://www.data.imf.org/en |
| World Bank Development Indicators | https://databank.worldbank.org/source/world-development-indicators |
| OECD Economic Outlook | https://data-explorer.oecd.org/ |
| FRED Economic Data | https://fred.stlouisfed.org/ |
| International Relations | Correlates of War Dataset | https://correlatesofwar.org/ |
| UCDP Conflict Dataset | https://ucdp.uu.se/ |
| GDELT Global Events Database | https://www.gdeltproject.org/ |
| World Trade Organization Statistics | https://stats.wto.org/ |
| SIPRI Military Expenditure Database | https://www.sipri.org/databases/milex |
| Education | National Center for Education Statistics | https://catalog.data.gov/dataset?publisher=NationalCenterforEducationStatistics%28NCES%29 |
| OECD PISA Dataset | https://www.oecd.org/pisa/data/ |
| World Bank Education Statistics | https://databank.worldbank.org/source/education-statistics |
| Open University Learning Analytics Dataset | https://analyse.kmi.open.ac.uk/open-dataset |
| UCI Student Performance Dataset | https://archive.ics.uci.edu/ml/datasets/student+performance |
| Commerce | UN Comtrade International Trade Data | https://comtrade.un.org/ |
| World Bank Enterprise Surveys | https://data360.worldbank.org/en/dataset/WB_ES |
| Retail Scanner Data (US Census) | https://www.kaggle.com/datasets/census/retail-and-retailers-sales-time-series-collection |
| Eurostat Business Statistics | https://ec.europa.eu/eurostat |
| Global Financial Data | https://github.com/JerBouma/FinanceDatabase |
| Business Administration | Harvard Dataverse Business Datasets | https://dataverse.harvard.edu/ |
| Crunchbase Open Data Map | https://data.crunchbase.com/ |
| Compustat Financial Dataset | https://www.marketplace.spglobal.com/ |
| IBM HR Analytics Dataset | https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset |
| Wharton Research Data Services | https://wrds.wharton.upenn.edu/ |
| Physics | CERN Open Data Portal | https://opendata.cern.ch/ |
| NASA Physical Sciences Data | https://pds.nasa.gov/ |
| Gravitational Wave Open Science Center (formerly LIGO Open Science Center) | https://losc.ligo.org/ |
| Materials Project Dataset | https://materialsproject.org/ |
| NIST Physical Measurement Data | https://www.nist.gov/data |
| Chemistry | PubChem Database | https://pubchem.ncbi.nlm.nih.gov/ |
| ChemSpider | https://www.chemspider.com/ |
| NIST Chemistry WebBook | https://webbook.nist.gov/chemistry/ |
| Harvard Clean Energy Project Dataset | https://cepdb.molecularspace.org/ |
| QM9 Molecular Dataset | https://deepchemdata.s3-us-west-1.amazonaws.com/datasets/qm9.csv |
| Mathematics | OEIS Integer Sequences | https://oeis.org/ |
| L-Functions and Modular Forms Database | https://www.lmfdb.org/ |
| UCI Mathematical Datasets | https://archive.ics.uci.edu/ml/index.php |
| Numerical Dataset Archive | https://www.kaggle.com/datasets/subhashinimariappan/numerical-dataset |
| Kaggle Mathematical Modeling Data | https://www.kaggle.com/datasets/xinyilea/mathematical-modeling-data/data |
| Computational Science | NERSC Scientific Data Repository | https://www.nersc.gov |
| Argonne Leadership Computing Facility Data | https://ieee-dataport.org/documents/argonne-leadership-computing-facility-data-catalog |
| NASA High-End Computing Data | https://www.nas.nasa.gov/hecc/ |
| LANL Simulation Datasets | https://www.kaggle.com/c/LANL-Earthquake-Prediction |
| Statistics | StatLib Data Archive | http://lib.stat.cmu.edu/datasets/ |
| UCI Machine Learning Repository | https://archive.ics.uci.edu/ml/ |
| World Bank Statistical Data | https://databank.worldbank.org/ |
| OECD Statistics | https://stats.oecd.org/ |
| US Census Bureau Statistics | https://www.kaggle.com/datasets/census/census-bureau-usa |
| Biology | NCBI BioProject | https://www.ncbi.nlm.nih.gov/bioproject/ |
| Ensembl Genome Database | https://www.useast.ensembl.org/ |
| Human Protein Atlas | https://www.proteinatlas.org/ |
| PDB Biological Structures | https://www.rcsb.org/ |
| BioStudies Database | https://www.ebi.ac.uk/biostudies/ |
| Botany | TRY Plant Trait Database | https://www.try-db.org/ |
| GBIF Plant Occurrence Data | https://www.gbif.org/ |
| USDA PLANTS Database | https://plants.sc.egov.usda.gov/ |
| Global Biodiversity Information Facility (Plants) | https://www.gbif.org/dataset |
| Plant Phenotyping Dataset | https://www.plant-phenotyping.org/datasets |
| Zoology | GBIF Animal Occurrence Data | https://www.gbif.org/ |
| PanTHERIA Mammal Traits Dataset | https://esapubs.org/archive/ecol/E090/184/ |
| Animal Diversity Web Data | https://animaldiversity.org/ |
| Movebank Animal Tracking Data | https://www.kaggle.com/datasets/pulkit8595/movebank-animal-tracking |
| VertNet Vertebrate Dataset | https://vertnet.org/ |
| Microbiology | NCBI Genome Database | https://www.ncbi.nlm.nih.gov/genome/ |
| PATRIC Bacterial Bioinformatics Resource | https://www.patricbrc.org/ |
| IMG/M Microbial Genome Database | https://img.jgi.doe.gov/ |
| Human Microbiome Project | https://www.hmpdacc.org/ |
| MicrobiomeDB | https://microbiomedb.org/ |
| Genetics | NCBI Gene Database | https://www.ncbi.nlm.nih.gov/gene/ |
| 1000 Genomes Project | https://www.internationalgenome.org/ |
| GWAS Catalog | https://www.ebi.ac.uk/gwas/ |
| ClinVar Genetic Variants | https://www.ncbi.nlm.nih.gov/clinvar/ |
| OMIM Genetic Disorders Database | https://www.omim.org/ |
| Genomics | ENCODE Project Dataset | https://www.encodeproject.org/ |
| GenBank | https://www.ncbi.nlm.nih.gov/genbank/ |
| TCGA Genomic Data | https://portal.gdc.cancer.gov/ |
| UCSC Genome Browser Data | https://genome.ucsc.edu/ |
| ArrayExpress Genomics Data | https://www.ebi.ac.uk/arrayexpress/ |
| Molecular Biology | Protein Data Bank | https://www.rcsb.org/docs/general-help/organization-of-3d-structures-in-the-protein-data-bank |
| UniProt Protein Database | https://www.uniprot.org/ |
| BioGRID Interaction Dataset | https://downloads.thebiogrid.org/BioGRID |
| STRING Protein Interaction Data | https://string-db.org/ |
| Gene Expression Omnibus | https://www.ncbi.nlm.nih.gov/sites/GDSbrowser/ |
| Immunology | ImmPort Immunology Data | https://www.immport.org/ |
| IEDB Immune Epitope Database | https://www.iedb.org/ |
| Human Cell Atlas Immune Data | https://www.data.humancellatlas.org/ |
| Vaccine Adverse Event Reporting System | https://vaers.hhs.gov/data.html |
| FlowRepository Cytometry Data | https://flowrepository.org/ |
| Neurobiology | Allen Brain Atlas | https://brain-map.org/ |
| OpenNeuro | https://openneuro.org/ |
| Human Connectome Project | https://www.humanconnectome.org/ |
| Neurodata Without Borders | https://www.nwb.org/ |
| CRCNS Neural Data Repository | https://crcns.org/data-sets |
| Bioinformatics | NCBI Sequence Read Archive | https://www.ncbi.nlm.nih.gov/sra/docs/sradownload/ |
| KEGG Pathway Database | https://www.genome.jp/kegg/ |
| Reactome Pathway Dataset | https://reactome.org/download-data |
| BioMart Data Portal | https://asia.ensembl.org/info/data/biomart/index.html |
| NCBI Gene Expression Omnibus (GEO) | https://www.ncbi.nlm.nih.gov/geo/ |
| Marine Biology | NOAA Oceanographic Data | https://www.nodc.noaa.gov/ |
| Coral Reef Monitoring Dataset | https://www.kaggle.com/datasets/jxwleong/coral-reef-dataset |
| World Ocean Atlas | https://www.ncei.noaa.gov/products/world-ocean-atlas |
| Marine Microbial Eukaryote Transcriptome Project | https://gold.jgi.doe.gov/sraexperiment?id=SRX554091 |
| Wildlife Biology | Movebank Wildlife Tracking Data | https://www.movebank.org/ |
| Global Biodiversity Information Facility | https://www.gbif.org/ |
| IUCN Red List Data | https://www.iucnredlist.org/resources/spatial-data-download |
| Wildlife Insights Camera Trap Data | https://www.wildlifeinsights.org/ |
| Living Planet Database | https://livingplanetindex.org/data_portal |
| Human Biology | UK Biobank | https://www.ukbiobank.ac.uk/ |
| Human Protein Atlas | https://www.proteinatlas.org/ |
| NHANES Health Dataset | https://www.kaggle.com/datasets/cdc/national-health-and-nutrition-examination-survey |
| Human Cell Atlas | https://www.humancellatlas.org/ |
| GTEx Gene Expression Dataset | https://gtexportal.org/home/ |