Previously, Meta used a combination of warmup (to get players ready) and prefetch (to cache content on disk) for video delivery. While these methods helped improve network efficiency, they introduced significant challenges. Warmup required instantiating multiple player instances sequentially, which consumed significant memory and limited preloading to only a few videos. This high resource demand meant that a more scalable, robust solution was needed to deliver the instant playback expected on modern, fast-scrolling social feeds.
Integrating Media3 PreloadManager
To achieve truly instant playback, Meta’s Media Foundation Client team integrated the Jetpack Media3 PreloadManager into Facebook and Instagram. They chose the DefaultPreloadManager to unify their preloading and playback systems. This integration required refactoring Meta’s existing architecture to enable efficient resource sharing between the PreloadManager and ExoPlayer instances. This strategic shift provided a key architectural advantage: the ability to parallelize preloading tasks and manage many videos using a single player instance. This dramatically increased preloading capacity while eliminating the high memory overhead of their previous approach.
Optimization and Performance Tuning
The team then performed extensive testing and iterations to optimize performance across Meta’s diverse global device ecosystem. Initial aggressive preloading sometimes caused issues, including increased memory usage and scroll performance slowdowns. To solve this, they fine-tuned the implementation by using careful memory measurements, considering device fragmentation, and tailoring the system to specific UI patterns.
Fine-tuning the implementation for specific UI patterns
Meta applied different preloading strategies and tailored the behavior to match the specific UI patterns of each app:
- Facebook Newsfeed: The UI prioritizes the video currently coming into view. The manager preloads only the current video to ensure it starts the moment the user pauses their scroll. This “current-only” focus minimizes data and memory footprints in an environment where users may see many static posts between videos. While the system is presently designed to preload just the video in view, it can be adjusted to also preload upcoming videos.
- Instagram Reels: This is a pure video environment where users swipe vertically. For this UI, the team implemented an “adjacent preload” strategy. The PreloadManager keeps the videos immediately before and after the current Reel ready in memory. This bi-directional approach ensures that whether a user swipes up or down, the transition remains instant and smooth. The result was a dramatic improvement in the Quality of Experience (QoE), including improvements in Playback Start and Time to First Frame for the user.
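The two strategies can be sketched as a small ranking policy. The following is a minimal, illustrative Java sketch, not Meta’s implementation: the `PreloadStrategy` and `PreloadPolicy` names and the buffer durations are assumptions made for this example. In Media3, a decision of this kind is supplied to the preload manager through the `TargetPreloadStatusControl` interface; the sketch models only the decision logic in plain Java so that it runs without Android dependencies, and it treats “adjacent” as both neighbors of the current item, matching the bi-directional behavior described for Reels.

```java
import java.util.OptionalLong;

/** Hypothetical names for the two strategies described in the text. */
enum PreloadStrategy { CURRENT_ONLY, ADJACENT }

final class PreloadPolicy {
    // Assumed buffer targets; the article does not state actual durations.
    static final long CURRENT_TARGET_MS = 3_000L;
    static final long ADJACENT_TARGET_MS = 1_000L;

    private final PreloadStrategy strategy;

    PreloadPolicy(PreloadStrategy strategy) {
        this.strategy = strategy;
    }

    /**
     * Returns how many milliseconds of media to preload for the item at
     * itemIndex, or an empty value if the item should not be preloaded.
     */
    OptionalLong targetPreloadMs(int itemIndex, int currentIndex) {
        int distance = Math.abs(itemIndex - currentIndex);
        if (distance == 0) {
            // Both strategies always prepare the item in view.
            return OptionalLong.of(CURRENT_TARGET_MS);
        }
        if (strategy == PreloadStrategy.ADJACENT && distance == 1) {
            // Only the adjacent strategy prepares the previous and next items.
            return OptionalLong.of(ADJACENT_TARGET_MS);
        }
        return OptionalLong.empty();
    }
}
```

With `new PreloadPolicy(PreloadStrategy.ADJACENT)` and position 5 in view, positions 4 and 6 receive a preload target; with `CURRENT_ONLY`, only position 5 does.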
Scaling for a diverse global device ecosystem
Scaling a high-performance video stack across billions of devices requires more than just aggressive preloading; it requires intelligence. Meta faced initial challenges with memory pressure and scroll lag, particularly on mid-to-low-end hardware. To solve this, they built a Device Stress Detection system around the Media3 implementation. The apps now monitor I/O and CPU signals in real-time. If a device is under heavy load, preloading is paused to prioritize UI responsiveness.
This device-aware optimization ensures that the benefit of instant playback doesn’t come at the cost of system stability, allowing even users on older hardware to experience a smoother, uninterrupted feed.
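A gate of this kind can be sketched as follows. This is a hypothetical illustration, not Meta’s Device Stress Detection system: the `DeviceStressGate` name, the load values normalized to the range 0 to 1, and the thresholds are all assumptions. The separate pause and resume thresholds (hysteresis) are an added design choice for the example, intended to keep preloading from toggling rapidly when load hovers near a single limit.

```java
final class DeviceStressGate {
    private final double pauseAbove;   // assumed: pause when load exceeds this
    private final double resumeBelow;  // assumed: resume when load drops below this
    private boolean paused = false;

    DeviceStressGate(double pauseAbove, double resumeBelow) {
        if (resumeBelow > pauseAbove) {
            throw new IllegalArgumentException("resumeBelow must not exceed pauseAbove");
        }
        this.pauseAbove = pauseAbove;
        this.resumeBelow = resumeBelow;
    }

    /**
     * Records the latest CPU and I/O load samples (each assumed to be
     * normalized to 0..1) and returns true if preloading may proceed.
     */
    boolean allowPreload(double cpuLoad, double ioLoad) {
        double worst = Math.max(cpuLoad, ioLoad);
        if (worst > pauseAbove) {
            paused = true;   // device is under heavy load: pause preloading
        } else if (worst < resumeBelow) {
            paused = false;  // load has clearly recovered: resume preloading
        }
        // Between the two thresholds the previous state is kept.
        return !paused;
    }
}
```

A caller would check `allowPreload(...)` with fresh samples before scheduling each preload, and skip or defer the work whenever it returns false so that scrolling stays responsive.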
Architectural wins and code health
Beyond the user-facing metrics, the migration to Media3 PreloadManager offered long-term architectural benefits. While the integration and tuning process needed multiple iterations to balance performance, the resulting codebase is more maintainable. The team found that the PreloadManager API integrated cleanly with the existing Media3 ecosystem, allowing for better resource sharing. For Meta, the adoption of Media3 PreloadManager was a strategic investment in the future of video consumption.
By adopting preloading and adding device-intelligent gates, they successfully increased total watch time on their apps and improved the overall engagement of their global community.
Resulting impact on Instagram and Facebook
The proactive architecture delivered immediate and measurable improvements across both platforms.
- Facebook experienced faster playback starts, decreased playback stall rates, and a reduction in bad sessions (such as those with rebuffering, delayed starts, or lower quality), which together resulted in higher watch time.
- Instagram saw faster playback starts and an increase in total watch time. Eliminating join latency (the interval from the user’s action to the display of the first frame) directly increased engagement metrics. Fewer interruptions from buffering meant users watched more content, which was reflected in those metrics.

Key engineering learnings at scale
As media consumption habits evolve, the demand for instant experiences will continue to grow. Implementing proactive memory management and optimizing for scale and device diversity ensures your application can meet these expectations efficiently.
Focus on delivering a reliable experience by minimizing stutters and loading times through preloading. Rather than simple disk caching, leveraging memory-level preloading ensures that content is ready the moment a user interacts with it.
Customize preloading behavior to fit your app’s UI. For example, use a “current-only” focus for mixed feeds like Facebook to save memory, and an “adjacent preload” strategy for vertical environments like Instagram Reels.
Integrating with Media3 APIs rather than a custom caching solution allows for better resource sharing between the player and the PreloadManager, enabling you to manage multiple videos with a single player instance. This results in a future-proof codebase that engineering teams can more easily maintain and optimize over time, and that benefits from the latest feature updates.
Broaden your market reach by testing on various devices, including mid-to-low-end models. Use real-time signals like CPU, memory, and I/O to adapt features and resource usage dynamically.











