Jukebox is a neural network developed by OpenAI that generates music, including rudimentary singing, as raw audio in various genres and artist styles. It creates new music samples from scratch, conditioned on genre, artist, and lyrics.

Key features:
- Generates music in a variety of genres and styles, including singing
- Generates music based on provided lyrics, aligning the audio with the lyrical content
- Allows exploration of curated and uncurated samples generated by the model
- Produces high-fidelity audio by compressing raw audio into discrete codes with a hierarchical VQ-VAE, generating in that compressed space, and upsampling the result back to audio
- Can be steered to generate music in the style of specific artists or genres

How it works:
Jukebox compresses raw audio into a lower-dimensional discrete space using a VQ-VAE (Vector Quantized Variational Autoencoder), then generates new audio in that compressed space. The process involves:
1. Compressing the raw audio into discrete codes using a hierarchical VQ-VAE model
2. Generating new sequences of codes using transformer models trained to predict the distribution of music codes
3. Upsampling the generated codes to finer levels of the hierarchy and decoding them back to the raw audio space
The transformer models are conditioned on additional information such as artist, genre, and lyrics during training, which is what allows the output to be steered at generation time.
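
The compress-to-codes and decode-from-codes idea can be illustrated with a minimal sketch of vector quantization. This is a toy under stated assumptions, not Jukebox's implementation: the real model uses learned convolutional encoders and decoders and a learned codebook, whereas here the "encoder" simply cuts the audio into fixed-size frames and the codebook is hand-written. The names `encode`, `decode`, `frame`, and `hop` are hypothetical.

```python
def frame(audio, hop):
    """Split a list of samples into non-overlapping frames of `hop` samples."""
    return [audio[i:i + hop] for i in range(0, len(audio) - hop + 1, hop)]


def encode(audio, codebook, hop):
    """Map each frame to the index of its nearest codebook vector (squared L2).

    Assumption: a real VQ-VAE first passes the audio through a learned
    encoder network; this toy quantizes the raw frames directly.
    """
    codes = []
    for f in frame(audio, hop):
        dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in codebook]
        codes.append(dists.index(min(dists)))
    return codes


def decode(codes, codebook):
    """Rebuild an approximate waveform by looking up each code's vector."""
    out = []
    for k in codes:
        out.extend(codebook[k])
    return out


# Hand-written toy codebook: silence, high plateau, low plateau, oscillation.
codebook = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [-1.0, -1.0, -1.0, -1.0],
    [1.0, -1.0, 1.0, -1.0],
]
audio = [0.9, 1.1, 1.0, 0.8, -0.1, 0.0, 0.1, 0.0, 0.9, -1.0, 1.1, -0.9]

codes = encode(audio, codebook, hop=4)   # 12 samples become 3 integers
recon = decode(codes, codebook)          # lossy reconstruction of the input
print(codes, recon)
```

Each code stands in for `hop` samples, so the sequence a generative model has to predict is `hop` times shorter than the waveform. A hierarchical VQ-VAE repeats this at several compression rates.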

Jukebox is designed as a research tool. Its code is open source, so developers can integrate it into their own applications.

Use of AI:
Jukebox is a generative AI system that combines a VQ-VAE for compression with transformers for generation.

AI foundation model:
The foundation model is a hierarchical VQ-VAE combined with transformer-based priors, built on the principles of autoregressive modeling and sparse transformers.
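
Autoregressive generation over discrete codes can be sketched as follows. This is an illustrative toy, not Jukebox's code: `make_toy_prior` is a hypothetical stand-in for a trained transformer prior, and `genre_bias` is an assumed, greatly simplified way to show how a conditioning signal can shift the predicted distribution.

```python
import random


def sample_codes(next_probs, start, length, rng):
    """Draw codes one at a time, each conditioned on the codes sampled so far."""
    codes = [start]
    while len(codes) < length:
        probs = next_probs(codes)
        codes.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return codes


def make_toy_prior(genre_bias, vocab_size=4):
    """Hypothetical stand-in for a trained prior over music codes.

    A real prior would be a transformer attending over the whole history;
    this toy looks only at the last code.
    """
    def next_probs(history):
        last = history[-1]
        weights = [1.0] * vocab_size
        weights[(last + 1) % vocab_size] += 2.0  # favor stepping to the next code
        weights[genre_bias] += 1.0               # conditioning nudges the distribution
        total = sum(weights)
        return [w / total for w in weights]
    return next_probs


rng = random.Random(0)
prior = make_toy_prior(genre_bias=2)
codes = sample_codes(prior, start=0, length=16, rng=rng)
print(codes)
```

The sampled codes would then be passed to a decoder to obtain audio. Changing `genre_bias` changes which codes are likely, which is the same mechanism, in miniature, by which conditioning steers the output.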

How to access:
The Jukebox code and model weights are publicly available as open source, so researchers and developers can experiment with and build upon the model. A tool for exploring the generated samples is also available.

