About GPT-4.1
GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a context window of roughly 1 million tokens (1,047,576) and outperforms GPT-4o and GPT-4.5 on coding (54.6% on SWE-bench Verified), instruction compliance (87.4% on IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, which makes it well suited to agents, IDE tooling, and enterprise knowledge retrieval.
Specifications
- Provider: OpenAI
- Context Length: 1,047,576 tokens
- Input Types: image, text, file
- Output Types: text
- Category: GPT
- Added: 4/14/2025
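The context length above can be turned into a simple budget check before sending a large document. The following is a minimal sketch, not an official API: the helper names `fits_in_context` and `remaining_tokens` are hypothetical, the token counts are assumed to come from whatever tokenizer the caller already uses, and it assumes that the prompt and any reserved output share the same 1,047,576-token window.

```python
# Context length taken from the specifications list above.
CONTEXT_LIMIT = 1_047_576


def remaining_tokens(prompt_tokens: int,
                     reserved_output_tokens: int = 0,
                     limit: int = CONTEXT_LIMIT) -> int:
    """Return how many tokens are left after the prompt and reserved output.

    Assumption: prompt and output draw from the same window.
    A negative result means the request would not fit.
    """
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return limit - prompt_tokens - reserved_output_tokens


def fits_in_context(prompt_tokens: int,
                    reserved_output_tokens: int = 0,
                    limit: int = CONTEXT_LIMIT) -> bool:
    """Return True if the prompt plus reserved output stays within the limit."""
    return remaining_tokens(prompt_tokens, reserved_output_tokens, limit) >= 0


if __name__ == "__main__":
    # Hypothetical request: a 1,000,000-token prompt with 32,768 tokens reserved.
    print(fits_in_context(1_000_000, 32_768))
    print(remaining_tokens(1_000_000, 32_768))
```

Keeping the limit as a parameter lets the same check be reused if a deployment enforces a smaller window than the advertised one.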
Benchmark Performance
How GPT-4.1 compares to its closest rivals across industry benchmarks
Evaluates AI ability to resolve real GitHub issues from Python repositories
Metric: % Resolved
- #14 Qwen3-Coder 480B/A35B Instruct: 55.4
- #16 Claude 3.7 Sonnet (20250219): 52.8
- #21 gpt-oss-120b: 26.0
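The listing shows only the rival models, so a short sketch can place GPT-4.1 among them. It assumes the benchmark shown here is SWE-bench Verified, so that the 54.6% figure quoted in the description is comparable with the listed scores; that pairing is an assumption, not something the listing states. The function names are hypothetical.

```python
# Scores copied from the listing above (% Resolved).
listed_rivals = [
    ("Qwen3-Coder 480B/A35B Instruct", 55.4),
    ("Claude 3.7 Sonnet (20250219)", 52.8),
    ("gpt-oss-120b", 26.0),
]

# Assumption: the 54.6% SWE-bench Verified score from the model description
# is measured on the same benchmark as the rivals above.
gpt_4_1 = ("GPT-4.1", 54.6)


def ordered_by_score(entries):
    """Return entries sorted from highest to lowest score."""
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def position_among(entries, candidate):
    """Return the candidate's 1-based position among the given entries only."""
    return 1 + sum(1 for _, score in entries if score > candidate[1])


if __name__ == "__main__":
    for name, score in ordered_by_score(listed_rivals + [gpt_4_1]):
        print(f"{name}: {score}")
    print(position_among(listed_rivals, gpt_4_1))
```

The position computed here is relative to the three listed rivals only, not the rank on the full leaderboard.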
Frequently Asked Questions
Common questions about GPT-4.1