Xiang Zhang
fancyzhx
AI & ML interests
None yet
Organizations
None yet
Video Datasets
Text Datasets
- TxT360: Trillion Extracted Text
  Running • 132
  Explore and analyze the TxT360 dataset for LLM pre-training
- CASIA-LM/ChineseWebText2.0
  Viewer • Updated • 2k • 1.76k • 28
- HPLT/HPLT2.0_cleaned
  Viewer • Updated • 9.03B • 32.7k • 36
- TrevorDohm/Pile_Tokenized
  Viewer • Updated • 134M • 326
Audio Datasets
Robotic Datasets
Video Datasets
Image Datasets
Text Datasets