Search tool that only returns content created before ChatGPT's public release
TL;DR Highlight
A search tool that shows only human-written content published before November 30, 2022, helping you avoid AI-generated slop in search results.
Who Should Read
Developers, researchers, and anyone frustrated with AI-generated content flooding search results and degrading the quality of information found online.
Core Mechanics
- Filters search results to only content published before November 30, 2022 (ChatGPT's public launch date)
- Designed to surface original human-written content uncontaminated by AI generation
- Simple implementation: date filter applied to a standard search index
- Useful for finding original research, human opinions, and pre-AI baseline information
- Highlights the broader 'slop problem': AI-generated content now dominates certain search verticals
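The date-cutoff mechanic above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the tool's actual implementation; the result shape and field names are assumptions:

```python
from datetime import date

# ChatGPT's public launch date: content on or after this may be AI-generated
CUTOFF = date(2022, 11, 30)

def pre_ai_only(results):
    """Keep only search results published before the ChatGPT launch cutoff."""
    return [r for r in results if r["published"] < CUTOFF]

# Hypothetical search results with publication dates
results = [
    {"url": "https://example.com/a", "published": date(2021, 5, 1)},
    {"url": "https://example.com/b", "published": date(2023, 2, 14)},
    {"url": "https://example.com/c", "published": date(2022, 11, 29)},
]

filtered = pre_ai_only(results)
# Only the 2021 and 2022-11-29 entries survive the filter
```

In practice the same cutoff is applied inside the search index rather than post-hoc, but the filtering logic is identical.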
Evidence
- Tool demonstration showing higher-quality, human-written results vs. standard search
- Community discussion validating the value of the pre-ChatGPT cutoff date
- Anecdotal evidence of AI-generated content degrading search quality in technical and medical verticals
How to Apply
- Use this tool (or apply a before:2022-11-30 operator in your search engine) when you need trustworthy, human-written source material.
- For research tasks requiring post-2022 information, cross-reference AI-generated content against pre-2022 sources to check for accuracy.
- If you publish content, focus on genuine expertise and original analysis; this is an increasingly valuable differentiator in an AI-saturated web.
Code Example
# How to search Google for content predating ChatGPT
# The before: operator works out of the box, with no extensions or plugins
https://www.google.com/search?q=your+search+term+before%3A2022-11-30
# Also works the same way on Mojeek
https://www.mojeek.com/search?q=your+search+term+before%3A2022-11-30
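The same query URLs can be built programmatically. A small sketch using only the Python standard library (the helper name and default engine are my own choices, not part of the tool):

```python
from urllib.parse import urlencode

def pre_chatgpt_search_url(terms, engine="https://www.google.com/search"):
    """Build a search URL restricted to content published before ChatGPT's launch."""
    query = f"{terms} before:2022-11-30"
    return f"{engine}?{urlencode({'q': query})}"

url = pre_chatgpt_search_url("rust async runtime")
# https://www.google.com/search?q=rust+async+runtime+before%3A2022-11-30
```

Passing engine="https://www.mojeek.com/search" produces the equivalent Mojeek query.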
Related Papers
Can LLMs model real-world systems in TLA+?
A benchmark study systematically verifying that when LLMs write TLA+ specifications, they pass syntax checks well but reach only about 46% behavioral conformance with the actual system, showing the practical limits of AI-based formal verification.
Natural Language Autoencoders: Turning Claude's Thoughts into Text
Anthropic published NLA, a technique that converts the numeric vectors (activations) inside an LLM into readable natural language. It is a new advance in interpretability research into what an AI is actually "thinking."
ProgramBench: Can language models rebuild programs from scratch?
A new benchmark measuring whether LLMs can reimplement real software such as FFmpeg, SQLite, and a PHP interpreter from scratch using only documentation; even the best model passed 95%+ of tests on only 3% of the tasks.
MOSAIC-Bench: Measuring Compositional Vulnerability Induction in Coding Agents
Split a request into three tickets and Claude/GPT will simply write security-vulnerable code 53-86% of the time.
Refusal in Language Models Is Mediated by a Single Direction
Open-source chat models encode safety as a single vector direction, and removing it disables safety fine-tuning.
Show HN: A new benchmark for testing LLMs for deterministic outputs
Structured Output Benchmark assesses LLM JSON handling across seven metrics, revealing performance beyond schema compliance.