Is something cooking?

Djay automatically adapts its separation algorithm based on the available computing power of the device it’s running on. It’s almost as if it had several quality presets that it activates adaptively. Therefore, you won’t get the same separation quality on a 2016 Android smartphone as you would on an iPad with an M1 processor.

Djay runs on any type of device, but the final rendering quality is not at all the same depending on what you’re running it on.

Where can I read more information about this process?

According to ChatGPT (again)

The fact that djay Pro (by Algoriddim) manages to do stem separation across a wide range of devices, including older smartphones from 2016, is due to a combination of smart optimizations and dynamic adaptation to available hardware.

Here’s a breakdown :backhand_index_pointing_down:


:white_check_mark: 1. Use of adaptive / fallback AI models

Yes, they are almost certainly using multiple versions of models with varying complexity and quality levels.

:repeat_button: The more powerful the device, the larger and more precise the model used. :repeat_button: The less powerful, the lighter and more approximate the algorithm.

So:

  • An iPhone 14 Pro Max likely uses a heavier, high-quality model, similar to Demucs v4 or Stems 2.0

  • A 2016 Android phone may use a compressed or simplified model, possibly doing only basic separation (e.g., vocals vs instrumental)

:light_bulb: This is the same approach used by apps like Deezer’s Spleeter, Meta’s Demucs, or OpenAI’s Whisper (tiny, base, medium, etc.).


:white_check_mark: 2. Quantization, pruning, and optimization for CPU/NPU

To run on less powerful phones:

  • Models are usually quantized to INT8, or even INT4 for extremely lightweight versions

  • They are pruned (removing unnecessary parameters)

  • They’re optimized for:

    • ARM CPUs

    • NPUs when available (e.g., Apple Neural Engine, Snapdragon Hexagon DSP)

    • Mobile GPUs via Metal (iOS) or Vulkan (Android)
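To make the INT8 point above concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It only illustrates the general technique; nothing here is known to match what Algoriddim actually ships, and the function names and sample weights are invented for the example.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0  # avoid dividing by zero
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.42, -1.27, 0.05, 0.9]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error, which is the trade-off the list above describes.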


:white_check_mark: 3. “Good enough” real-time stem separation

djay Pro’s goal is not scientific perfection, but a smooth user experience.

:backhand_index_pointing_right: They likely:

  • Use post-processing tricks to hide imperfections

  • Implement temporal caching (e.g., pre-processing the first few seconds)

  • Add dynamic crossfades or EQ to give the illusion of better separation


:white_check_mark: 4. Hybrid processing: local + cloud (optional)

Some apps (like Moises.ai or VirtualDJ in offline analysis mode) offer two modes:

  • :brain: Local: fast, lightweight processing with decent quality

  • :cloud: Cloud: offloads the task to powerful servers for high-end results

Even though djay Pro appears to run everything locally, they could still:

  • Use background pre-processing

  • Cache some analysis

  • Or potentially fall back to cloud under specific conditions
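The local/cloud split described above boils down to a small routing decision. The sketch below is purely hypothetical: the function name, the parameters, and the 10 Mbps threshold are assumptions for illustration, not anything Algoriddim, Moises, or VirtualDJ has documented.

```python
def choose_backend(cloud_enabled, online, bandwidth_mbps, min_bandwidth_mbps=10.0):
    """Return "cloud" only when the remote path is allowed and usable.

    All inputs are hypothetical: a real app would measure connectivity
    and apply its own thresholds.
    """
    if cloud_enabled and online and bandwidth_mbps >= min_bandwidth_mbps:
        return "cloud"
    # Anything else falls back to the lighter on-device model.
    return "local"


# A phone on a weak connection stays local; a well-connected one goes remote.
weak = choose_backend(cloud_enabled=True, online=True, bandwidth_mbps=2.0)
strong = choose_backend(cloud_enabled=True, online=True, bandwidth_mbps=50.0)
```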


:bar_chart: Hypothetical multi-tier model (simplified)

| Target device | Model type | Separation |
| --- | --- | --- |
| iPhone 6s (2015) | tiny-unet-int8 | Vocals vs rest |
| iPhone XR (2018) | spleeter-lite | 2–4 stems, medium |
| iPad Pro M1 | demucs v4 hybrid | 5 stems, high quality |
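A tiered scheme like the hypothetical table above could be selected at startup with a simple threshold lookup. This is only a sketch: the model names are the speculative labels from the table, and the capability scores and thresholds are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    model: str      # speculative label taken from the table above
    min_score: int  # hypothetical device capability score (0-100)
    stems: int      # maximum number of stems this tier produces


# Ordered from heaviest to lightest so the first match is the best fit.
TIERS = [
    Tier("demucs v4 hybrid", 80, 5),
    Tier("spleeter-lite", 40, 4),
    Tier("tiny-unet-int8", 0, 2),
]


def pick_tier(score):
    """Return the heaviest tier whose threshold the device score meets."""
    for tier in TIERS:
        if score >= tier.min_score:
            return tier
    return TIERS[-1]  # scores below every threshold get the lightest model


old_phone = pick_tier(10)
tablet = pick_tier(95)
```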

:bullseye: In summary:

:white_check_mark: Yes, djay Pro most likely uses an adaptive architecture with:

  • Compressed or simplified models depending on the device

  • Smart hardware utilization (NPU, CPU, GPU)

  • Perceptual optimizations that make the output seem better than it may technically be

This flexibility allows them to run well on devices from 2016 all the way to current M1/M2 iPads.

Edit from me: That said, for a sampler-type device like an MPC, choosing an RK3588 may be appropriate in that you’re only processing relatively short samples (10s, 20s, 30s, etc.) and you don’t necessarily need on-the-fly separation.

But for DJ equipment where you expect on-the-fly separation as soon as the track is loaded, with track lengths of several minutes, that’s a completely different matter.

So there’s no actual data provided by Algoriddim to support this?

:magnifying_glass_tilted_left: Relevant official sources

  1. Algoriddim Press Release — djay Pro 5 / Neural Mix with AudioShake

    • In their December 2023 announcement, Algoriddim explicitly states:

      “Fitting large AI models onto devices, in real-time, and without a loss in quality, is a big technical challenge.”

    • They also mention their partnership with AudioShake, which suggests they’re using professionally recognized source separation technology and working to make it compatible with mobile devices. (Source)

  2. Press Release – Neural Mix Pro (2020)

    • They explain that Neural Mix Pro uses Apple’s Core ML on macOS, enabling real-time separation of vocals, instruments, etc.

    • The use of Core ML confirms they rely on device-specific acceleration (GPU/CPU/NPU) when available. (Source)

  3. Press Release – djay Pro AI / Neural Mix (2020)

    • They say their Neural Mix AI system separates music into its original components on-the-fly.

    • They also recommend devices with at least an A12 Bionic chip for good performance, implying scalability depending on hardware. (Source)

  4. djay Pro 5.2 announcement (Windows on ARM)

    • They confirm that on ARM-based Windows devices, djay Pro uses the NPU (Neural Processing Unit) for Neural Mix, which clearly shows that hardware acceleration is dynamically leveraged. (Source)

:warning: What these sources don’t explicitly say (but can be reasonably inferred)

  • They don’t publicly share exact model sizes, latency benchmarks, or TOPS requirements per device.

  • Internal implementation details (e.g., which model is used on which device, quantization levels, pruning, etc.) are not disclosed in depth.


:white_check_mark: Conclusion

Yes, there is clear official confirmation that:

  • djay / Algoriddim uses real-time source separation models (Neural Mix) on mobile and desktop devices.

  • They adapt performance and quality based on available hardware (as confirmed by Core ML use, NPU acceleration, and chip recommendations).

That’s just more AI slop; using words like ‘imply’ immediately dismisses it as not official.

Surely if they have confirmed it, there will be a simple link to an official post explaining how it works?

They don’t communicate clearly on this topic or how it works under the hood, but the only way to achieve this is to use different algorithm models, some lighter but with more approximate quality, and others heavier with a higher level of rendering quality.

There’s no miracle.

Perhaps InMusic is considering a hybrid solution for the next platform. Use an RK3288 as the SoC but condition the real-time stem separation on a remote server based in the cloud, as DJay can do on certain devices when they are connected. This is done completely transparently for the user but will obviously require a good Wi-Fi or 5G connection to run quickly and smoothly.

So you could have local separation with medium quality, and high-quality separation based in the cloud.

But if this is the approach they’ve chosen, then expect InMusic to ask you for a subscription for cloud-based stem splitting services. I doubt they’ll provide cloud servers for free to every user.

Interesting (and a bit disappointing) to see that the battery size in the Live III has shrunk considerably. The old one was 14.8V 3350mAh (=49.6Wh). The new one seems to be 14.4V 2750mAh (=39.6Wh). Since the battery is rather small at 40x130mm, it seems that the rest of the unit is crammed with electronics and the speaker drivers.

The RK3588 is beefier and actively cooled (the RK3288 has a chunky passive cooling block), so it should draw more power unless it is downclocked and optimized in other ways. It would be a shame if the Live III gets less than 5-6hrs of playtime.
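For anyone who wants to check the battery figures, energy in watt-hours is just voltage times capacity in amp-hours. A quick sketch using the numbers quoted in this post (the variable names are mine):

```python
def watt_hours(volts, milliamp_hours):
    """Energy in Wh from nominal voltage and capacity in mAh."""
    return volts * milliamp_hours / 1000


old_wh = watt_hours(14.8, 3350)  # old battery
new_wh = watt_hours(14.4, 2750)  # new Live III battery, per the leaked figures

# Relative loss of stored energy, in percent.
drop_pct = (old_wh - new_wh) / old_wh * 100

print(f"old: {old_wh:.1f} Wh, new: {new_wh:.1f} Wh, drop: {drop_pct:.0f}%")
```

So the new pack stores roughly a fifth less energy than the old one.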

Perhaps InMusic is considering a hybrid solution for the next platform. Use an RK3288 as the SoC but condition the real-time stem separation

They won’t build in the 3288 again, I am very sure. That boat has sailed, by now it’s simply too outdated. It’s easier for the whole ecosystem (Denon, Numark, Akai, etc.) to focus on one, stronger chip as foundation.

As for the AI stuff or real-time stems, as I already stated once, I regard it as being of secondary importance, considering any serious DJ prepares the music on a (much stronger) laptop anyway, where it is more convenient to simply batch-process all stems ahead of time and then play them. The RK3588 chip is still great to have for all other tasks, including much faster track analysis and maybe ‘lower quality / simplified’ stems for streaming tracks, which you’re more likely to play at private parties or maybe a friend’s wedding, rather than big clubs or festivals (and even there most DJs don’t use Stems yet due to quality limitations, I am sure).

The RK3588 uses an 8nm process compared to 28nm for the 3288. So it should logically consume less power, or require a smaller battery to achieve roughly the same battery life.

This is your point of view only. Personally, I already hate spending hours preparing to place my hot cues and organize my playlists, so if I have to pre-render my stems, it’s a dealbreaker for me.

I like having the ability to load a track at any time that I hadn’t necessarily thought of in my preparation, because I leave a lot of room for improvisation and I use stems on almost every track I load in open-format sessions with Virtual DJ.

Pre-rendering isn’t something that suits my workflow, so are you saying I’m not a serious DJ? You can be a very good DJ when you leave room for improvisation to surprise your audience.

Just because pre-rendering works for you doesn’t mean it works for everyone. So your point of view isn’t a universal truth for all DJs on this planet.

Relax your tone, I never stated that my point is ‘universal truth’. Also, you stating ‘your point of view only’ isn’t exactly sensible either.

My personal view is that many (I never said all) DJs will rather make use of their existing, more powerful laptop/desktop chips (the vanilla M1 alone is still more powerful than the RK3588) where their main library is located and the whole preparation routine is convenient to do, as you can do it all at once, export, and perform. Just like many/most have done over the last 20 years in Rekordbox or Engine.

You can still add improvisation like spontaneous tracks (as I said, streaming for example) and there the RK3588 will certainly be far more capable than the 3288. I never said this wouldn’t be useful. I just stand by my point that this chip is probably not meant to replace an entire 20k track library analyzing & preparation pipeline/workflow - which, in my eyes, is absolutely okay. Preparation + adding spontaneous track/streaming/ideas can complement each other perfectly fine.

Personally I am very happy that it’s the 3588 (confirmed by the FCC leaks) and not some weaker variant, like the 3576. Live 3 comes with 8GB/128GB which is a massive jump. L1/L2/L3 caches also got a nice increase, and bandwidth will be higher, too, including the I/O.


The RK3588 uses an 8nm process compared to 28nm for the 3288. So it should logically consume less power, or require a smaller battery to achieve roughly the same battery life.

A smaller process node doesn’t automatically mean lower power draw if the chip’s architecture/layout and performance have been expanded, which is the case here. It also has an active fan now, as I already stated, and I just found a leaked store listing where ‘up to 4hrs’ of playtime are listed. The Live 2 had ‘up to 6hrs’. So yeah, certainly not the same. But that is mostly due to the smaller battery size and added LEDs in the buttons. If we are lucky, the SoC’s power draw remains roughly the same and the fan runs at very low speed, or in short bursts, like during track analysis or AI stuff.
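As a rough sanity check, the ‘up to’ playtimes can be turned into an implied average power draw. This is back-of-the-envelope only: it assumes the advertised hours are measured the same way for both units and that the whole battery capacity is usable, neither of which is confirmed.

```python
# Battery energies from the figures quoted earlier in the thread.
old_wh, new_wh = 49.58, 39.6
# Advertised best-case playtimes (Live 2 vs the leaked Live III listing).
old_hours, new_hours = 6, 4

old_watts = old_wh / old_hours  # implied average draw of the old unit
new_watts = new_wh / new_hours  # implied average draw of the new unit

# Playtime the new battery would give if the draw had stayed the same.
same_draw_hours = new_wh / old_watts

print(f"old: {old_watts:.1f} W, new: {new_watts:.1f} W")
print(f"new battery at old draw: {same_draw_hours:.1f} h")
```

Under those assumptions the smaller battery alone would account for a drop to a bit under 5 hours; the remaining gap down to 4 hours would have to come from a higher average draw (SoC, fan, LEDs).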

It’s definitely the case that the quality of stem separation (Neural Mix) in Algoriddim’s Djay Pro varies between devices, based on their internal hardware.

For instance, the PC version doesn’t provide full quality separation, but the Mac version does.

VirtualDJ, on the other hand, will give you full quality regardless of the hardware. You might have to wait longer, but it will do it.

The fact is, personally, I’d be pretty disappointed if we had an RK3588 in the next generation of DJ equipment, which seems likely since the hardware is generally shared with Akai.

It’s still a chip under $60, and that’s the unit price, not a wholesale price, as you can see here.

https://www.chipmall.com/keywords/RK3588

So I’ll let you imagine what price InMusic must be getting by ordering it by the thousands, probably around $35-40 each.

So once again, we’ll end up with a $40 processor in a device that will cost €2,000 for the high-end models, simply because there’s a shared platform between all the devices.

While I can understand the chip being justified in entry-level devices like the MixStream or Prime Go at $800/1000, someone buying a high-end AIO for over $2,500 or a flagship player for $1,700 ends up with a $40 SoC from 2022, identical to the one in the entry-level product.

Would you buy a $2,500 Dell laptop that had the same CPU as an $800 Dell laptop?

InMusic has once again cut corners with a processor that’s underpowered for AI-based applications, which are clearly the technologies of the future.

If it’s an RK3588 on the next platform, I’d definitely pass on it because it won’t give me anything better than what I’m getting with my LC6000 and my i5 + RTX 4060 laptop, which cost me less than €900.

That’s because macs have blast processing.

That’s because Algoriddim favour Apple products. There’s no video mixing on the PC version either, or tag editing. Meanwhile other vendors offer an equal feature set regardless of platform. It’s been mentioned many times by reviewers.

Algoriddim and Apple are very bonded. Wasn’t the CEO of Algo an ex-employee of Apple?

Anyways, it’s very optimized for Apple silicon.

I see you speculate as much as I do.

If you’re basing this on raw processing then sure. You’re forgetting that coding makes that hardware perform. I have faith Denon can make it happen. Akai did.

Well, it will be the RK3588 since a) from a cost-saving and development perspective it makes perfect sense to use the same chip → ISA → kernel across all products, just like now, and b) there simply wasn’t any newer or faster Rockchip available in 2022/2023 when the unit was, as I guess, in final development (my battery image above shows 2022 as production date). Actually, there still isn’t any official successor either; it’s their most powerful chip. And no, it isn’t trivial to just switch to a completely different chip. Hence, given these circumstances, I am totally fine with this pick, since it is the most powerful one.

While I partially agree with your production costs vs selling price argument (and I also find anything around $2000 for a Live3 too expensive), we can’t compare (rather) niche DJ or production devices with mass-market laptops. That’s a completely different scale, with many other R&D aspects, like coding, as @djdragon mentioned. The LC6000 combined with a laptop is a perfectly fine and legit alternative, if you don’t need standalone. Laptops evolve faster, pure software also evolves faster - specialized hardware doesn’t. But for me, it’s just more fun to use.

Some may argue that the 6 TOPS NPU limits that chip’s future-proofing, and if AI tasks are all (or most) of what matters to you as a DJ or producer, you might be right. But if everything else matters (i.e. actually performing music), even the current, quite old RK3288 still does a solid job. When Akai and Denon released their first products with said chip (Live/X and SC5000) in 2017, the whole competition was far more underpowered. Pioneer still had their lousy 2000NXS2 with their 20fps screens and rudimentary firmware (rather than software), and all other synths and grooveboxes had their RAM measured in megabytes rather than gigabytes (some of them still do today), and no, they still don’t come cheaper. Deluge, Elektron, and Roland, just to name a few, don’t even need UI teams, so limited are their screens.

We also have to add all the additional hardware around the CPU, like RAM (we don’t know what will be available for the Denon players / all-in-ones). The RK3588 can have up to 32GB or as little as 4GB. There can be a multitude of peripherals to make the whole unit work / do stuff. We will see when we get there…

That is as accurate as it can ever be. For example: Elektron Digitakt 2 - 400MB of RAM, unit is 1000EUR.

I think the joke went over your head, or maybe I’m just too old and remember the Sega-Nintendo 16-bit wars, which your comment brought back memories of. :laughing: