Fix Corrupted Filenames on ZimaOS RAID After NAS Migration

Author
Julien
Cloud & AI Infrastructure Engineer.
Dedicated to bringing AI to Ops & Ops to AI.

I hit a frustrating issue on my ZimaBlade after migrating files from an old Synology NAS to a ZimaOS RAID volume.

Some filenames looked normal in the file browser, but in the terminal they were full of broken escape sequences like this:

facture_f$'\202'vrier.pdf

In other words, a filename that should have looked like facture_février.pdf was being rendered with a raw escaped byte instead.

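That \202 pattern was the clue. It is the octal escape for the single byte 0x82, which is not valid UTF-8 on its own. The files had been created with a legacy non-UTF-8 encoding (0x82 is é in the DOS code pages CP437 and CP850), and ZimaOS was now exposing the raw bytes instead of valid accented characters.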

If you have the same problem, this post shows the fix I used to clean an entire directory tree safely.

If you are a human
#

Here is the fuller explanation of what happened and why this approach worked.

The problem
#

After the migration, some files and folders contained corrupted accented characters:

  • é showed up as octal escapes like \202
  • some tools refused to process the files
  • shell commands became painful because filenames had to be escaped manually

On a normal Debian or Ubuntu system, I would usually install a few troubleshooting packages and test different conversions.

But ZimaOS is more locked down than a standard Linux install. The root filesystem is immutable, so common package-manager-based fixes are not always available directly on the host.

That changed the approach.

What caused it
#

The root cause was not ZimaOS itself.

The filenames had most likely been created years ago with a legacy Western encoding on the Synology side, then copied onto a modern Linux system that expects UTF-8.

When that happens, accented characters can turn into mojibake or raw byte escapes instead of readable text.
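To make the failure mode concrete, here is a minimal sketch that rebuilds the example name from its raw bytes and decodes it under two candidate encodings. It assumes iconv is installed, and it treats CP850 as the source encoding only because of the \202 byte, not because the migration confirmed it. The helper names sample_name and decode_as are made up for this illustration.

```shell
# Emit the example filename as raw bytes: \202 is octal for the byte 0x82.
sample_name() { printf 'f\202vrier.pdf'; }

# Decode the sample under a candidate source encoding (assumption: iconv is installed).
decode_as() { sample_name | iconv -f "$1" -t UTF-8; }

# A lone 0x82 is a UTF-8 continuation byte, so strict validation rejects the name.
if sample_name | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
  echo "valid UTF-8"
else
  echo "not valid UTF-8"
fi

# 0x82 is "é" in CP850 but a low quotation mark in CP1252.
printf 'CP850:  %s\n' "$(decode_as CP850)"
printf 'CP1252: %s\n' "$(decode_as CP1252)"
```

Only the CP850 line should read as a sensible French word, which is the kind of quick check that helps decide whether a recovery attempt is worth the time.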

In theory, the best fix is to recover the original encoding and convert the filenames properly.

In practice, that is not always realistic.

If the directory contains mixed encodings, or if you no longer trust the original source, spending hours trying to perfectly reconstruct every é, è, or ô may not be worth it. In my case, the pragmatic fix was to sanitize everything to safe ASCII.

That means:

  • remove problematic characters
  • replace spaces with underscores
  • keep only portable characters that behave well across shells, scripts, cloud sync tools, and Linux filesystems

This is not the most elegant fix, but it is often the most reliable one.
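As a rough sketch of what that sanitization does to a single name, the function below uses tr instead of the rename utility used later in this post. The name sanitize_name is hypothetical, and the safe set is the same one used in the rename regex, minus the path separator because it works on one name at a time.

```shell
# Replace every byte outside the portable set with "_".
# tr works byte by byte, so a two-byte UTF-8 "é" yields two underscores.
sanitize_name() {
  printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_'
}

# Example: a space and an accented character are both replaced.
sanitize_name 'facture février.pdf'
echo
```

The result is lossy on purpose: once the accent is replaced, nothing in the new name records which character used to be there.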

What did not work well
#

Before going for the final cleanup, I tried the usual encoding-recovery path.

That included:

  • checking filenames with ls -ali
  • testing convmv with encodings like cp1252, iso-8859-15, and macroman
  • trying detox

The problem was that the results were inconsistent.

detox failed with errors like:

unsupported unicode length

That strongly suggested invalid or mixed byte sequences rather than a clean single encoding that could be converted in one pass.

At that point, cleanup was a better option than recovery.
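One way to check the "invalid byte sequences" suspicion is to test each name for UTF-8 validity. This is a sketch, assuming iconv and a GNU-style ls with -b are available; is_valid_utf8 and list_invalid_utf8 are illustrative names, not tools mentioned above.

```shell
# Succeeds when the argument is well-formed UTF-8 (assumption: iconv is installed).
is_valid_utf8() {
  printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Print every entry of one directory whose name is not valid UTF-8.
# ls -b shows the offending bytes as octal escapes instead of raw bytes.
list_invalid_utf8() {
  for path in "$1"/* "$1"/.[!.]*; do
    [ -e "$path" ] || continue
    is_valid_utf8 "${path##*/}" || ls -db -- "$path"
  done
}
```

Calling list_invalid_utf8 on a directory prints only the names that need attention. If valid and invalid names sit side by side, that points to mixed encodings rather than one clean legacy encoding.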

The fix: use Docker to run rename
#

Because ZimaOS is immutable, the easiest workaround was to run a temporary Debian container, mount the affected directory, install the Perl-based rename utility inside the container, and perform the rename there.

1. Set the target directory
#

Replace this path with the directory you want to clean:

TARGET="/media/ZimaRaid/path/to/your/data"

2. Run a dry-run first
#

This shows what would be renamed without modifying anything:

docker run --rm -v "$TARGET:/mnt" debian:bookworm-slim /bin/bash -lc '
  apt-get update &&
  apt-get install -y rename &&
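  find /mnt -depth -exec rename -n -d "s/[^A-Za-z0-9._\/-]/_/g" {} +
'

Why this works:

  • find /mnt -depth processes children before parents, so a directory is renamed only after everything inside it
  • rename applies a regex to each path, and -d restricts it to the last path component, so a file inside a directory that still has its old name is not renamed into a parent path that does not exist yet
  • [^A-Za-z0-9._\/-] matches any byte outside a safe ASCII set
  • every unsafe byte is replaced with _, so a two-byte UTF-8 é becomes __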
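One thing the dry-run does not make obvious is collisions: two different names can sanitize to the same result. The sketch below lists such cases for a single directory, using the same safe set as the regex. The function name find_collisions is hypothetical, and it assumes filenames contain no newline characters.

```shell
# Print each sanitized name that two or more entries of one directory would share.
# Assumption: filenames contain no newline characters.
find_collisions() {
  for path in "$1"/* "$1"/.[!.]*; do
    [ -e "$path" ] || continue
    printf '%s\n' "${path##*/}" | LC_ALL=C tr -c 'A-Za-z0-9._\n-' '_'
  done | LC_ALL=C sort | uniq -d
}
```

An empty result means no two entries in that directory compete for the same clean name. Any line it prints is a name worth fixing by hand before the real run.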

3. Run the real rename
#

If the dry-run looks good, remove -n:

docker run --rm -v "$TARGET:/mnt" debian:bookworm-slim /bin/bash -lc '
  apt-get update &&
  apt-get install -y rename &&
  find /mnt -depth -exec rename -d "s/[^A-Za-z0-9._\/-]/_/g" {} +
'

That was enough to clean the full directory tree on my RAID volume.

If one filename is too broken to type
#

Sometimes one file is so badly encoded that even copying its name is annoying.

In that case, renaming by inode is a useful escape hatch:

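find . -maxdepth 1 -inum 3141553 -exec mv {} diplome_bac.pdf \;

Here 3141553 is an example inode number. The -maxdepth 1 option keeps the match in the current directory, so mv does not pull a file up from a subdirectory. The inode is the first column of:

ls -ali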

This is handy when you only need to fix one stubborn file manually.
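If you do this more than once, a small wrapper with a safety check can help. This is only a sketch: rename_by_inode is a made-up name, it assumes a find that supports -mindepth and -maxdepth and an mv that supports -n, and it refuses to act unless exactly one entry matches, since hard links share an inode.

```shell
# Rename the single entry in DIR whose inode number is INODE to NEW_NAME.
# Usage: rename_by_inode DIR INODE NEW_NAME
rename_by_inode() {
  local dir=$1 inode=$2 new_name=$3 matches
  # Count matches first; hard links to the same file share one inode number.
  matches=$(find "$dir" -mindepth 1 -maxdepth 1 -inum "$inode" -print | wc -l)
  if [ "$matches" -ne 1 ]; then
    echo "expected exactly one match for inode $inode, found $matches" >&2
    return 1
  fi
  # mv -n never overwrites an existing file with the new name.
  find "$dir" -mindepth 1 -maxdepth 1 -inum "$inode" -exec mv -n -- {} "$dir/$new_name" \;
}
```

For example, rename_by_inode . 3141553 diplome_bac.pdf does the same job as the one-liner above, with the extra guard.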

How I verified the result
#

After the rename, I checked the directory again with:

ls -ali

What I wanted to see:

  • no more octal escape sequences in filenames
  • no more awkward shell escaping to access files
  • subdirectories renamed cleanly as well

That is exactly what happened.
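Reading ls output by eye does not scale to a large tree, so a count is a useful complement. The sketch below counts entries whose name still contains a byte outside the safe set. The function name count_unsafe_names is hypothetical, and the count assumes no filename contains a newline.

```shell
# Count entries below DIR whose name still contains a byte outside the safe set.
# LC_ALL=C makes find match raw bytes instead of locale-dependent characters.
count_unsafe_names() {
  LC_ALL=C find "$1" -mindepth 1 -name '*[!A-Za-z0-9._-]*' -print | wc -l
}
```

After a successful cleanup, count_unsafe_names "$TARGET" should print 0.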

Important tradeoff
#

This fix is intentionally destructive from a naming perspective.

It does not try to preserve the original accented characters. It standardizes everything to a safe subset instead.

That is the right choice when your priority is:

  • portability
  • script compatibility
  • avoiding future sync issues
  • making filenames predictable everywhere

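If you need to preserve the original human-readable names exactly, you should first try a proper encoding conversion workflow with tools like convmv or rsync --iconv before falling back to sanitization. For names where é shows up as \202, a DOS code page such as cp850 is a candidate source encoding worth testing.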

How to avoid this next time
#

If you are migrating from an older NAS or mixed environment, a few habits help a lot:

1. Sanitize at the source when possible
#

If the source system still understands the original encoding correctly, convert during transfer instead of after the fact.

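rsync can help here: its --iconv=LOCAL,REMOTE option converts filenames between character sets during the transfer, provided you know which encoding the source actually uses.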

2. Use portable naming conventions
#

For long-term storage, backups, scripts, and cloud sync, filenames are safer when they use:

  • lowercase letters
  • digits
  • dashes or underscores
  • no spaces
  • no accents

This is less pretty, but much more robust.

3. Detect bad filenames early
#

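To list filenames containing anything outside printable ASCII (accented characters, raw bytes, control characters), with LC_ALL=C so the match is done on raw bytes regardless of the locale:

LC_ALL=C find . -name '*[! -~]*'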

That is a quick audit command worth keeping around.

If you are an AI agent
#

Use this section as the operational runbook.

When this article applies
#

This procedure is a match when most of these signals are present:

  • filenames are displayed with escaped bytes such as $'\202'
  • ls -ali or ls -b shows octal escapes instead of readable accented characters
  • the system expects UTF-8 filenames
  • the files were migrated from an older NAS or legacy filesystem
  • tools such as detox or encoding conversion attempts fail or give inconsistent results

What this procedure does
#

Bulk-renames files and directories into a safe ASCII-only format.

This is a sanitization workflow, not a faithful recovery workflow.

Preconditions
#

  • Docker is installed and working
  • the target directory is writable
  • the user accepts destructive renaming of filenames
  • the user ideally has a backup before a bulk rename

Do not use this procedure when
#

  • the user must preserve original accented filenames exactly
  • the source system still presents the original encoding correctly and proper conversion is still possible
  • applications depend on exact filenames and would break after renaming

Steps
#

  1. Identify the affected directory.
  2. Inspect filenames with ls -ali.
  3. Run the Dockerized rename command in dry-run mode.
  4. If the preview is acceptable, run the real rename.
  5. Verify that escaped bytes are gone and files remain accessible.

Dry-run
#

TARGET="/media/ZimaRaid/path/to/your/data"

docker run --rm -v "$TARGET:/mnt" debian:bookworm-slim /bin/bash -lc '
  apt-get update &&
  apt-get install -y rename &&
  find /mnt -depth -exec rename -n -d "s/[^A-Za-z0-9._\/-]/_/g" {} +
'

Apply
#

TARGET="/media/ZimaRaid/path/to/your/data"

docker run --rm -v "$TARGET:/mnt" debian:bookworm-slim /bin/bash -lc '
  apt-get update &&
  apt-get install -y rename &&
  find /mnt -depth -exec rename -d "s/[^A-Za-z0-9._\/-]/_/g" {} +
'

Verify
#

ls -ali
LC_ALL=C find . -name '*[! -~]*'

Expected outcome:

  • filenames no longer contain escaped bytes
  • filenames contain only safe ASCII characters
  • files and directories remain accessible from the shell

Fallback for one broken filename
#

If one filename is too broken to type, rename it by inode:

find . -maxdepth 1 -inum 3141553 -exec mv {} clean_filename.pdf \;

Final takeaway
#

If you are running ZimaOS on a ZimaBlade and inherited badly encoded filenames from an old NAS migration, you do not need to fight the host OS to fix them.

Using a temporary Docker container is often the simplest path.

For me, the winning approach was not perfect filename recovery. It was a fast bulk cleanup to safe ASCII so the files became easy to use everywhere again.

If your goal is reliability more than historical accuracy, this method works well.


This article was written with an AI agent at my side — I brought the expertise, it helped with the words.
