Unsupervised Learning
- Unsupervised learning
- Training LLMs like GPT-4
- considered as UL, or commonly as self-supervised learning (subset of UL)
- dont have explicit labels or target outputs, but rather learning to predict next word in a sentence, given the previous words
- using the input text itself as both input and output
- data: given input x, but not output labels y
- eg. data on patients, tumorβs size, patientβs age
- goal: find sth interesting / patterns / structure in unlabeled data, not whether benign or malignant
- clustering algo:
- visualization
- place data in different clusters
- eg. gg news, group related stories together by keywords, in which the algo figure out by itself without supervision
- will also learn
- anomaly detection: find unusual data points
- dimensionality reduction: compress data using fewer numbers