Midjourney’s AI research expands beyond image generation into creative text generation, tackling a fundamental challenge in LLM outputs. The company has partnered with NYU researchers to develop new techniques that enhance diversity in AI-generated text while maintaining quality, marking an important advancement in making AI writing more imaginative and less predictable.
The big picture: Midjourney, with its massive user base of nearly 20 million on Discord alone, is venturing beyond its core image generation business by tackling limitations in AI-generated creative writing.
- The company collaborated with NYU researchers to develop two new techniques called Diversified Direct Preference Optimization (DDPO) and Diversified Odds Ratio Preference Optimization (DORPO).
- These methods specifically address the tendency of large language models to produce homogeneous, predictable text when asked to generate creative content.
The technical innovation: The research introduces “deviation” as a training metric to guide language models toward producing more varied and creative outputs.
- Unlike existing diversity-promoting techniques that operate only at inference time, these new methods integrate diversity considerations directly into the model’s training process.
- The researchers published their findings in a paper on Hugging Face, with the implementation available on GitHub.
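To make the idea of training-time "deviation" more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' implementation: it assumes deviation can be approximated as the mean cosine distance between a response's embedding and the embeddings of other responses to the same prompt, and that a DDPO-style objective scales the standard DPO loss by the deviation of the preferred response. The function names (`deviations`, `deviation_weighted_dpo_loss`) and the toy embeddings are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def deviations(embeddings):
    """Assumed deviation: mean distance from each response to the
    other responses written for the same prompt."""
    n = len(embeddings)
    return [
        sum(cosine_distance(embeddings[i], embeddings[j])
            for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair, from sequence log-probs."""
    margin = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    # -log(sigmoid(margin)), written in a numerically stable form
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def deviation_weighted_dpo_loss(pol_chosen, pol_rejected, ref_chosen,
                                ref_rejected, chosen_deviation, beta=0.1):
    """Assumed DDPO-style objective: pairs whose preferred response is
    more unusual contribute more to the training signal."""
    return chosen_deviation * dpo_loss(
        pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta
    )


# Toy embeddings (hypothetical): two near-identical responses and one outlier.
toy_embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_devs = deviations(toy_embeddings)
```

In this toy setup the outlier response receives the largest deviation, so a preference pair that favors it would be weighted more heavily than a pair favoring one of the two near-duplicates. In practice the log-probabilities would come from the policy and reference models, and the embeddings from a sentence-embedding model.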
Key findings: Tests showed that DDPO significantly outperformed standard approaches in balancing quality with diversity.
- Llama-3.1-8B models trained with DDPO achieved the best balance of output quality and creativity.
- Remarkably, even with reduced dataset sizes, DDPO-trained models maintained their ability to generate diverse outputs.
Why this matters: Creative writing fundamentally differs from factual or code generation tasks because it benefits from multiple valid approaches rather than a single correct answer.
- Standard LLM training methods often prioritize user preference over originality, resulting in safe but repetitive content.
- By addressing this limitation, Midjourney’s research could lead to AI systems that better support human creativity rather than merely producing predictable, templated content.
Behind the numbers: Midjourney’s expansion into LLM research comes after the company revealed, in late summer 2024, plans to develop its own computing and AI hardware.
- With “nearly 20 million users on its Discord channel” according to third-party trackers, Midjourney has the scale to potentially apply these creative writing improvements across multiple generative AI domains.
Midjourney’s surprise: new research on making LLMs write more creatively