Next possibility is real-time. Sorry if this is too much tech stuff; I don’t have the expertise. I am a programmer, but simple programming compared to this.
### Real-time stem separation over Wi‑Fi (streaming architecture)
Below is a practical blueprint for a near–real-time solution where your device streams audio chunks to a GPU-backed server, which returns separated audio in a continuous stream. This minimizes end-to-end latency vs. file upload.
At a high level:
- Client captures/reads audio in small frames (e.g., 20–100 ms), encodes them (Opus/WAV PCM), and sends them over WebSocket.
- Server buffers just enough audio to feed a streaming-capable separation model, runs inference in overlapping windows, and streams back per-stem audio frames.
- Client plays back returned stems with a jitter buffer to smooth network variance.
Key components:
- Transport: WebSockets for bi-directional low-latency streaming.
- Codec: PCM 16-bit WAV frames for simplicity, or Opus for bandwidth efficiency.
- Model: Demucs v4, MDX/UVR variants, or Spleeter-like models. For streaming, you’ll use a sliding-window pipeline with overlap-add to avoid boundary artifacts.
- Latency budget: capture (frame size) + client→server network + server buffering window + model inference + server→client network + playback jitter buffer. A rough worked example follows the parameter list below.
Typical starting parameters:
- Sample rate: 44.1 kHz or 48 kHz
- Frame size: 1024–4096 samples (≈21–85 ms at 48 kHz)
- Overlap: 50% for STFT-like pipelines
- Initial server buffer (warmup): 0.5–1.5 s
- Jitter buffer: 100–300 ms
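To make that budget concrete, here is a rough estimate using the parameters above. The network and inference figures are placeholders for a local Wi‑Fi link and a warmed-up GPU, so treat this as an order-of-magnitude sketch and measure your own numbers:

```python
# Rough end-to-end latency estimate for the parameters listed above.
# Network and inference figures are illustrative assumptions; measure on your setup.
SAMPLE_RATE = 48000
FRAME_SAMPLES = 2048               # capture frame
WINDOW_SAMPLES = SAMPLE_RATE       # 1.0 s separation window
HOP_SAMPLES = WINDOW_SAMPLES // 2  # 50% overlap

capture_ms   = FRAME_SAMPLES / SAMPLE_RATE * 1000    # ~42.7 ms
uplink_ms    = 5                                     # client -> server on local Wi-Fi (assumed)
buffering_ms = WINDOW_SAMPLES / SAMPLE_RATE * 1000   # worst case: a sample waits a full window
inference_ms = 50                                    # per-window GPU inference (assumed)
downlink_ms  = 5                                     # server -> client (assumed)
jitter_ms    = 200                                   # playback jitter buffer

total_ms = capture_ms + uplink_ms + buffering_ms + inference_ms + downlink_ms + jitter_ms
print(f"~{total_ms:.0f} ms end-to-end")  # ~1.3 s here; shrinking the window reduces this roughly proportionally
```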
Below is a working reference design you can adapt.
#### Server: FastAPI + WebSocket stream (Python)
This simplified server:
- Accepts PCM chunks over a WebSocket.
- Buffers chunks into a ring buffer.
- Runs a separation worker on a rolling window with overlap-add.
- Sends back per-stem PCM chunks tagged with sequence numbers.
Notes:
- Replace the placeholder “fake” separator with a real streaming pipeline (Demucs windowed inference or MDX with overlap).
- For true low latency, keep the model loaded on GPU and pre-warm it.
```python
# server.py
import json
from typing import Dict

import numpy as np
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from fastapi.websockets import WebSocketState
from fastapi.middleware.cors import CORSMiddleware
from scipy.signal import butter, filtfilt

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"], allow_credentials=True, allow_methods=["*"], allow_headers=["*"],
)

SAMPLE_RATE = 48000
CHANNELS = 1          # mono input; extend to stereo if needed
FRAME_SAMPLES = 2048  # ~42.7 ms at 48 kHz
BYTES_PER_SAMPLE = 2  # int16
FRAME_BYTES = FRAME_SAMPLES * BYTES_PER_SAMPLE * CHANNELS

# Separation window
WINDOW_SAMPLES = SAMPLE_RATE       # 1.0 s window
HOP_SAMPLES = WINDOW_SAMPLES // 2  # 50% overlap


class StreamingSeparator:
    def __init__(self, sample_rate: int):
        self.sample_rate = sample_rate
        # TODO: load your model here (e.g., Demucs/MDX). Keep it on GPU and warmed up.

    def process_window(self, audio_np: np.ndarray) -> Dict[str, np.ndarray]:
        """
        audio_np: shape [samples], mono float32 in [-1, 1].
        Returns a dict of stems of the same length (or slightly shorter if the model trims).
        Replace with actual model inference inside a torch.no_grad() block.
        """
        # Placeholder: pass-through 'mixture' plus a fake 'vocals' stem
        mixture = audio_np
        # "Fake" vocals: a simple high-pass emphasis (not real separation)
        b, a = butter(4, 200.0 / (self.sample_rate / 2.0), btype='high')
        vocals = filtfilt(b, a, mixture).astype(np.float32)
        other = mixture - 0.5 * vocals
        return {
            "mixture": mixture,
            "vocals": vocals,
            "other": other,
        }


separator = StreamingSeparator(SAMPLE_RATE)


def int16_to_float32(pcm_bytes: bytes) -> np.ndarray:
    return np.frombuffer(pcm_bytes, dtype=np.int16).astype(np.float32) / 32768.0


def float32_to_int16(arr: np.ndarray) -> bytes:
    arr = np.clip(arr, -1.0, 1.0)
    return (arr * 32767.0).astype(np.int16).tobytes()


@app.websocket("/ws/separate")
async def ws_separate(ws: WebSocket):
    await ws.accept()
    # Ring buffer for input audio
    input_buffer = np.zeros(0, dtype=np.float32)
    # Output overlap-add state per stem
    overlap_cache: Dict[str, np.ndarray] = {}
    seq_in = 0
    seq_out = 0
    hann = np.hanning(WINDOW_SAMPLES).astype(np.float32)

    # Control: expect a JSON config message first
    try:
        init_msg = await ws.receive_text()
        cfg = json.loads(init_msg)
        # cfg could specify sample_rate, stems, etc. Here we assume defaults.
    except Exception:
        await ws.close(code=1002)
        return

    async def send_chunk(stem: str, seq: int, chunk: np.ndarray):
        header = json.dumps({"stem": stem, "seq": seq, "samples": int(chunk.shape[0])})
        await ws.send_text(header)
        await ws.send_bytes(float32_to_int16(chunk))

    try:
        while True:
            msg = await ws.receive()
            if msg["type"] == "websocket.disconnect":
                break
            if msg.get("bytes") is not None:
                # Append incoming PCM16 frame
                pcm_bytes = msg["bytes"]
                if len(pcm_bytes) == 0:
                    continue
                audio_f32 = int16_to_float32(pcm_bytes)
                input_buffer = np.concatenate([input_buffer, audio_f32])
                seq_in += 1
                # Process as long as we have enough for a window
                while input_buffer.shape[0] >= WINDOW_SAMPLES:
                    window = input_buffer[:WINDOW_SAMPLES]
                    input_buffer = input_buffer[HOP_SAMPLES:]  # hop forward
                    stems = separator.process_window(window)
                    # Simple overlap-add using a Hann window
                    for stem_name, stem_audio in stems.items():
                        stem_audio = stem_audio.astype(np.float32) * hann
                        # Previous windowed frame for this stem (zeros if first window)
                        prev = overlap_cache.get(stem_name)
                        if prev is None or prev.shape[0] != WINDOW_SAMPLES:
                            prev = np.zeros(WINDOW_SAMPLES, dtype=np.float32)
                        # Overlap-add: add the first half to the previous tail and emit a HOP-sized chunk
                        first_half = stem_audio[:HOP_SAMPLES] + prev[HOP_SAMPLES:]
                        await send_chunk(stem_name, seq_out, first_half)
                        # Store the current windowed frame for the next overlap
                        overlap_cache[stem_name] = stem_audio
                    seq_out += 1
            elif msg.get("text") is not None:
                # Handle control messages, e.g., stop, change stems, etc.
                cmd = json.loads(msg["text"])
                if cmd.get("cmd") == "stop":
                    break
    except WebSocketDisconnect:
        pass
    finally:
        if ws.application_state != WebSocketState.DISCONNECTED:
            await ws.close()
```
Run the server:
```bash
pip install fastapi uvicorn numpy scipy
uvicorn server:app --host 0.0.0.0 --port 8000
```
To integrate a real separator:
- Load your torch model in `StreamingSeparator.__init__`.
- In `process_window`, run model inference on the window (consider stereo and batching).
- Maintain consistent latency by fixing `WINDOW_SAMPLES` and `HOP_SAMPLES`.
For Demucs-like models, use windowed inference with 50% overlap, and soft masks to reduce seams. Keep the model on GPU and avoid reallocation per call.
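As a concrete illustration, below is a hedged sketch of a Demucs-backed separator that drops into the `process_window` slot above. It assumes the `demucs` package’s `get_model`/`apply_model` helpers and the `htdemucs` checkpoint name; verify these against the Demucs version you install, and note that Demucs checkpoints have their own native sample rate (`model.samplerate`, typically 44.1 kHz), so you may need to resample or run the whole pipeline at that rate.

```python
# Sketch only: a Demucs-backed replacement for StreamingSeparator.process_window.
# Assumes Demucs v4's get_model/apply_model API; check the version you install.
import numpy as np
import torch
from demucs.pretrained import get_model
from demucs.apply import apply_model


class DemucsSeparator:
    def __init__(self, sample_rate: int, device: str = "cuda"):
        self.sample_rate = sample_rate  # NOTE: resample if this differs from model.samplerate
        self.device = device
        self.model = get_model("htdemucs").to(device).eval()  # load once, keep warm

    def process_window(self, audio_np: np.ndarray) -> dict:
        # Demucs expects (batch, channels, time); duplicate mono into two channels.
        wav = torch.from_numpy(audio_np).float().to(self.device)
        wav = wav.unsqueeze(0).repeat(2, 1).unsqueeze(0)  # (1, 2, T)
        with torch.no_grad():
            # 'overlap' here is Demucs' internal chunk overlap, separate from our outer overlap-add.
            out = apply_model(self.model, wav, overlap=0.25, device=self.device)[0]  # (stems, 2, T)
        # Downmix each stem back to mono to match the rest of this pipeline.
        return {name: stem.mean(dim=0).cpu().numpy() for name, stem in zip(self.model.sources, out)}
```

The stem names come from `model.sources`, so adjust the client’s stem queues to match whatever the checkpoint provides.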
#### Client: Python example (streams microphone or WAV to server and plays stems)
This client:
- Captures audio frames (or reads from a WAV file).
- Sends frames as PCM16 over WebSocket.
- Receives back per-stem frames and plays them using a jitter buffer.
You can adapt this to a browser (WebAudio + WebSocket) or mobile. For Python, we’ll use `sounddevice` for capture/playback.
```python
# client.py
import asyncio
import json
import queue

import numpy as np
import sounddevice as sd
import websockets

SAMPLE_RATE = 48000
CHANNELS = 1
FRAME_SAMPLES = 2048
BYTES_PER_SAMPLE = 2
SERVER_URL = "ws://YOUR_SERVER_IP:8000/ws/separate"


def float32_to_int16(arr: np.ndarray) -> bytes:
    arr = np.clip(arr, -1.0, 1.0)
    return (arr * 32767.0).astype(np.int16).tobytes()


def int16_to_float32(pcm_bytes: bytes) -> np.ndarray:
    return np.frombuffer(pcm_bytes, dtype=np.int16).astype(np.float32) / 32768.0


async def run():
    # Queues per stem for playback (these act as a simple jitter buffer)
    stem_queues = {
        "vocals": queue.Queue(maxsize=100),
        "other": queue.Queue(maxsize=100),
        "mixture": queue.Queue(maxsize=100),
    }
    # Leftover samples per stem: server chunks are larger than the playback blocksize
    playback_buffers = {name: np.zeros(0, dtype=np.float32) for name in stem_queues}

    # Playback callback shared by all stem streams (mix or solo as desired)
    def playback_callback(outdata, frames, time, status, stem="vocals"):
        buf = playback_buffers[stem]
        # Pull queued chunks until we have enough samples for this block
        while buf.shape[0] < frames:
            try:
                buf = np.concatenate([buf, stem_queues[stem].get_nowait()])
            except queue.Empty:
                break
        if buf.shape[0] >= frames:
            outdata[:] = buf[:frames].reshape(-1, 1)
            playback_buffers[stem] = buf[frames:]
        else:
            outdata.fill(0.0)  # underrun: play silence
            playback_buffers[stem] = buf

    stream_vocals = sd.OutputStream(samplerate=SAMPLE_RATE, channels=1, blocksize=FRAME_SAMPLES,
                                    dtype="float32", callback=lambda *args: playback_callback(*args, stem="vocals"))
    stream_other = sd.OutputStream(samplerate=SAMPLE_RATE, channels=1, blocksize=FRAME_SAMPLES,
                                   dtype="float32", callback=lambda *args: playback_callback(*args, stem="other"))
    # You can start only one stem or mix them locally as you prefer.
    stream_vocals.start()
    # stream_other.start()

    async with websockets.connect(SERVER_URL, max_size=None) as ws:
        # Send init config
        await ws.send(json.dumps({"sample_rate": SAMPLE_RATE, "channels": CHANNELS}))
        # Microphone input stream
        in_stream = sd.InputStream(samplerate=SAMPLE_RATE, channels=1, blocksize=FRAME_SAMPLES, dtype="float32")
        in_stream.start()
        try:
            while True:
                # Read a frame from the mic (read() also returns an overflow flag)
                indata, _ = in_stream.read(FRAME_SAMPLES)
                pcm = float32_to_int16(indata.flatten())
                await ws.send(pcm)
                # Drain any frames the server has sent back so far
                while True:
                    try:
                        # Small timeout so the drain loop does not block the send loop
                        msg = await asyncio.wait_for(ws.recv(), timeout=0.001)
                    except asyncio.TimeoutError:
                        break
                    if isinstance(msg, str):
                        header = json.loads(msg)
                        stem = header["stem"]
                        samples = header["samples"]  # informational sample count
                        # The next message is the corresponding bytes payload
                        payload = await ws.recv()
                        audio = int16_to_float32(payload)
                        # Basic jitter buffer: if it is overflowing, drop the oldest chunk
                        q = stem_queues.get(stem)
                        if q:
                            if q.full():
                                try:
                                    q.get_nowait()
                                except queue.Empty:
                                    pass
                            q.put_nowait(audio)
        finally:
            in_stream.stop()
            in_stream.close()
            stream_vocals.stop()
            stream_vocals.close()
            # stream_other.stop(); stream_other.close()


if __name__ == "__main__":
    asyncio.run(run())
```
Replace `YOUR_SERVER_IP` with your server’s IP/hostname.
#### Integrating a real model (Demucs/MDX) with windowing
Checklist:
- Load model once:
- `torch.set_grad_enabled(False)`
- `model.eval().to(device)`
- Pre-allocate tensors on GPU.
- For each `process_window`:
- Convert `float32` NumPy → torch tensor on GPU: `torch.from_numpy(window).to(device)`
- If stereo is needed, pass `window` with shape [2, T].
- Compute STFT or feed raw waveform depending on model.
- Get stems; ensure time alignment.
- Overlap-add:
- Use a Hann or Tukey window (a quick numerical check follows this checklist).
- Maintain per-stem overlap cache exactly as shown.
- Latency tuning:
- Reduce `WINDOW_SAMPLES` to 0.5 s of audio (24,000 samples at 48 kHz, keeping the 50% hop) for lower latency, at a slight quality cost.
- Keep a 100–200 ms client jitter buffer to hide network jitter.
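On the window choice: with 50% overlap, the periodic form of the Hann window sums to a constant, so overlap-add keeps a flat gain across seams; NumPy's `np.hanning` is the symmetric form, which is only approximately constant at this length. A quick numerical check, using the window and hop sizes from the server sketch:

```python
# Quick check: does the analysis window satisfy constant overlap-add (COLA) at 50% hop?
import numpy as np

WINDOW = 48000    # 1.0 s window at 48 kHz (as in the server sketch)
HOP = WINDOW // 2  # 50% overlap

symmetric = np.hanning(WINDOW)                                         # what np.hanning gives you
periodic = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(WINDOW) / WINDOW)  # periodic Hann

for name, w in [("symmetric", symmetric), ("periodic", periodic)]:
    ola = w[:HOP] + w[HOP:]  # gain seen by any sample covered by two overlapping windows
    print(f"{name:9s}: min={ola.min():.6f}  max={ola.max():.6f}")
# The periodic variant sums to exactly 1.0; the symmetric one deviates on the order of 1/WINDOW,
# which is inaudible at this length but worth knowing if you shrink the window aggressively.
```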
#### Security, scaling, and ops
- Protect the WebSocket endpoint with a token (JWT in the first control message); a minimal sketch follows this list.
- Use TLS (wss://) via a reverse proxy (Caddy/Nginx) and systemd to keep the server running.
- If you expect multiple concurrent clients, use a lightweight queue per connection and consider batching on GPU if windows line up.
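For the token check, here is a minimal sketch assuming PyJWT and a shared HS256 secret (the `SHARED_SECRET` name and the `token` field are illustrative; adapt to however you issue tokens). The server validates the first control message before it starts accepting audio:

```python
# Minimal sketch: validate a JWT carried in the client's first control message.
# Assumes PyJWT (`pip install pyjwt`) and a shared HS256 secret; adapt to your auth setup.
import json
from typing import Optional

import jwt  # PyJWT

SHARED_SECRET = "change-me"  # illustrative; load from an env var or secret store in practice


async def authenticate(ws) -> Optional[dict]:
    """Read the init message, verify its token, and return the config dict (or None on failure)."""
    init_msg = await ws.receive_text()
    cfg = json.loads(init_msg)
    try:
        jwt.decode(cfg.get("token", ""), SHARED_SECRET, algorithms=["HS256"])
    except jwt.InvalidTokenError:
        await ws.close(code=1008)  # 1008 = policy violation
        return None
    return cfg
```

The client would then include `"token": jwt.encode({...}, SHARED_SECRET, algorithm="HS256")` in its init JSON.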
#### When to choose Opus
- If Wi‑Fi bandwidth is constrained, encode frames with Opus at 48 kHz mono 64–96 kbps. Use `pyogg`/`opuslib` on Python client or native Opus in browser. This adds a small codec delay but saves bandwidth substantially.
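To put numbers on "substantially": raw 16-bit PCM at 48 kHz mono is 768 kbps per stream, and the server returns one stream per stem, so the downlink multiplies. A back-of-the-envelope comparison (the stem count is just an example):

```python
# Back-of-the-envelope bandwidth: raw PCM16 vs. Opus, mono at 48 kHz.
SAMPLE_RATE = 48000
BITS_PER_SAMPLE = 16
CHANNELS = 1
STEMS_RETURNED = 2  # e.g., vocals + other coming back (illustrative)

pcm_kbps = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS / 1000  # 768 kbps per stream
opus_kbps = 96                                              # upper end of the 64–96 kbps range

print(f"uplink  : PCM {pcm_kbps:.0f} kbps  vs  Opus {opus_kbps} kbps")
print(f"downlink: PCM {pcm_kbps * STEMS_RETURNED:.0f} kbps  vs  Opus {opus_kbps * STEMS_RETURNED} kbps")
# Roughly an 8x saving at 96 kbps (12x at 64 kbps), in exchange for a few ms of codec
# delay and some CPU on both ends for encode/decode.
```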
#### Next steps
- Do you want me to provide:
- A Demucs-based `StreamingSeparator` implementation with PyTorch?
- A browser client (WebAudio + Opus + WebSocket) so phones can use it without Python?
- A Dockerfile and systemd unit to deploy the server on a GPU VM?