    # Preserve code fences and inline code by swapping them for placeholders,
    # so the translation engine never sees them.
    def protect_blocks(txt):
        blocks = {}

        def repl(m):
            # One unique key per protected span: __CODE_0__, __CODE_1__, ...
            key = f"__CODE_{len(blocks)}__"
            blocks[key] = m.group(0)
            return key

        txt = re.sub(r'```[\s\S]*?```', repl, txt)  # fenced blocks (triple backticks)
        txt = re.sub(r'`[^`\n]+`', repl, txt)       # inline code
        return txt, blocks
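After translation, the placeholders have to be swapped back for the original code. The following is a minimal round-trip sketch that assumes the `__CODE_n__` placeholder scheme used above. `restore_blocks` is a hypothetical helper name that does not appear in the original script, and `protect_blocks` is repeated only so the example runs on its own.

```python
import re


def protect_blocks(txt):
    """Replace fenced and inline code with placeholders; return text and mapping."""
    blocks = {}

    def repl(m):
        key = f"__CODE_{len(blocks)}__"
        blocks[key] = m.group(0)
        return key

    txt = re.sub(r"`{3}[\s\S]*?`{3}", repl, txt)  # fenced blocks first
    txt = re.sub(r"`[^`\n]+`", repl, txt)         # then inline code
    return txt, blocks


def restore_blocks(txt, blocks):
    """Hypothetical counterpart: put the original code back after translation."""
    # Unknown placeholders are left untouched instead of raising an error.
    return re.sub(r"__CODE_\d+__", lambda m: blocks.get(m.group(0), m.group(0)), txt)


if __name__ == "__main__":
    sample = "Run `make` first."
    protected, saved = protect_blocks(sample)
    print(protected)
    print(restore_blocks(protected, saved))
```

If the translation engine alters a placeholder (for example by inserting spaces), it will not be restored, so it is worth searching the translated text for leftover `__CODE_` tokens before post-editing.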
(“ktab my system mtrjm llrbyt pdf” → “Write my system, translate it to Arabic, and export as PDF”)

## 1️⃣ What This Guide Covers

| Step | What you’ll achieve |
|------|---------------------|
| 1. Gather & organize source material | All the text, diagrams, code snippets, etc. that describe your system. |
| 2. Choose a translation workflow | Machine‑only, human‑only, or hybrid (machine + post‑editing). |
| 3. Prepare the document for RTL (right‑to‑left) layout | Use a format that supports Arabic styling (Markdown → Pandoc, LaTeX, Word, etc.). |
| 4. Translate the content | Tools, glossaries, and best‑practice tips. |
| 5. Polish the Arabic version | Font selection, justification, headings, tables, figures, and code blocks. |
| 6. Convert to PDF | One‑click export or command‑line pipeline. |
| 7. Quality‑check (QC) & distribute | Proofread, test accessibility, and share the final PDF. |

## 2️⃣ Step‑by‑Step Workflow

### 2.1 Gather & Structure Your Source Docs

| Action | Tips |
|--------|------|
| Collect everything in one folder (e.g., `system-doc/`). | Keep subfolders: `src/`, `images/`, `code/`, `draft/`. |
| Use a markup language (Markdown `.md`, reStructuredText `.rst`, or LaTeX `.tex`). | These formats are easy to convert and keep formatting separate from content. |
| Plan for a table of contents (TOC) early; it will be auto‑generated later. | Pandoc: pass `--toc` on the command line. LaTeX: use `\tableofcontents`. |
| Mark up special blocks (code, tables, notes). | Triple backticks for code fences, pipe characters for tables, `>!` for warnings, etc. |

Result: A clean, version‑controlled source (Git repo) that can be re‑used for multiple languages.

### 2.2 Choose a Translation Strategy

| Strategy | When to use | Pros | Cons |
|----------|-------------|------|------|
| Pure Machine Translation (MT) | Quick drafts, low‑stakes docs. | Fast, cheap. | May mis‑translate technical terms. |
| Human Translation | High‑quality manuals, legal/medical content. | Accurate, consistent terminology. | Time‑consuming, higher cost. |
| Hybrid (MT + Post‑Editing) | Medium‑budget projects needing decent quality. | Faster than pure human, better than raw MT. | Requires a skilled editor. |
    % Code blocks (left-to-right)
    \usepackage{listings}
    \lstset{
      basicstyle=\ttfamily\small,
      language={},
      breaklines=true,
      frame=single,
      numbers=left,
      numberstyle=\tiny,
      xleftmargin=0.5cm,
      xrightmargin=0.5cm,
      columns=fullflexible,
      keepspaces=true,
      escapeinside={(*@}{@*)}
    }
    dst.write_text(translated, encoding='utf-8')
    print('✅ Translation saved to', dst)
    PY

Run the script on one section at a time (e.g., one chapter per run) to avoid hitting API limits and to make post‑editing easier.

#### 2.4.2 Post‑Editing Checklist

| Item | What to look for |
|------|------------------|
| Technical terminology | Verify against your glossary. |
| Numbers & units | Use either Arabic‑Indic digits (١٢٣) or Western digits (123), and stick to one throughout. |
| Directionality | Ensure bullet lists, tables, and headings flow RTL. |
| Code snippets | Keep them as‑is (English) and wrap them in a left‑to‑right block. |
| Figures & screenshots | Add Arabic captions (`\caption{...}`) and, if needed, mirror UI screenshots. |

### 2.5 Polish the Arabic Document

#### 2.5.1 Create a Pandoc Template (Arabic‑Ready)

Save this as `templates/arabic.tex`:
Write a tiny Bash script that runs pdftotext (poppler) on the PDF and greps for Arabic characters to ensure they are present and not image‑only.
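The text above calls for a Bash script. As a hedged alternative in the same language as the translation script, the check can be sketched in Python, assuming the text has already been extracted (for example with `pdftotext file.pdf -`). The function names and the 0.3 threshold are illustrative assumptions, not values from the original guide.

```python
import re

# Arabic block (U+0600-U+06FF), Arabic Supplement, and the two presentation-form
# blocks that PDF text extraction sometimes emits instead of the base letters.
ARABIC_RE = re.compile(r"[\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFC]")


def arabic_ratio(text):
    """Return the share of non-whitespace characters that fall in Arabic blocks."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if ARABIC_RE.match(c)) / len(chars)


def looks_like_real_arabic(text, threshold=0.3):
    """Heuristic: extracted text is 'real' Arabic if enough characters are Arabic.

    An image-only PDF typically yields empty or near-empty extracted text,
    which gives a ratio of 0.0 and fails this check.
    """
    return arabic_ratio(text) >= threshold


if __name__ == "__main__":
    print(looks_like_real_arabic("دليل نظام XYZ"))
    print(looks_like_real_arabic(""))
```

A ratio is used instead of a simple "any Arabic character" test so that a PDF with a real Arabic title page but scanned body pages is more likely to be flagged.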
    \end{RTL}
    \end{document}

    pandoc \
      --from markdown+yaml_metadata_block \
      --template=templates/arabic.tex \
      --pdf-engine=xelatex \
      --toc \
      --metadata title="دليل نظام XYZ" \
      --output output/system_xyz_ar.pdf \
      draft/system_ar.md

Explanation of flags
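As a hedged illustration of what each flag contributes, the same invocation can be assembled as a Python argument list with one comment per flag. `build_pandoc_command` is a hypothetical helper name; the flags and paths are the ones used in the command above.

```python
import shlex


def build_pandoc_command(source, output, title, template="templates/arabic.tex"):
    """Return the pandoc argument list for the Arabic PDF build."""
    return [
        "pandoc",
        "--from", "markdown+yaml_metadata_block",  # Markdown input with YAML front matter
        f"--template={template}",                  # custom LaTeX template with RTL setup
        "--pdf-engine=xelatex",                    # Unicode-aware engine that loads system fonts
        "--toc",                                   # generate a table of contents
        "--metadata", f"title={title}",            # title made available to the template
        "--output", output,                        # destination PDF
        source,                                    # translated Markdown file
    ]


if __name__ == "__main__":
    cmd = build_pandoc_command(
        "draft/system_ar.md", "output/system_xyz_ar.pdf", "دليل نظام XYZ"
    )
    print(shlex.join(cmd))  # shell-quoted form, safe to copy and paste
    # To run the build instead of printing it:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with the Arabic title.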
    # macOS (Homebrew)
    brew install pandoc
    brew install --cask mactex-no-gui   # includes XeLaTeX
Optional: Add a digital signature (`openssl`) if recipients must be able to detect tampering.
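A signature requires a key pair. Where only accidental corruption is a concern, a plain SHA-256 checksum published alongside the PDF is a lighter option. The sketch below shows that swapped-in technique and is not a signature: anyone who can alter the PDF can also recompute the hash. `sha256_of_file` is a hypothetical helper name.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recipients would recompute the digest on their copy and compare it with the published value.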
    # Requires GNU grep: -P enables Perl-style \x{...} code-point escapes
    pdftotext output/system_xyz_ar.pdf - | grep -qP '[\x{0600}-\x{06FF}]' && echo "✅ Arabic text detected"

| Channel | Recommended format |
|---------|--------------------|
| Email / intranet | PDF (max 10 MB, compressed). |
| Web download | PDF + an HTML version (run `pandoc -t html5`). |
| Printed manual | Use the same PDF; the RTL layout is already fixed in the file, so no special printer support is needed. |
    \documentclass[12pt]{article}
    \usepackage{fontspec}
    \usepackage{polyglossia}
    \setmainlanguage{arabic}
    \setotherlanguage{english}
    \newfontfamily\arabicfont[Script=Arabic]{Noto Sans Arabic}
    \newfontfamily\englishfont{Latin Modern Roman}
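    # deep-translator exports this class as DeeplTranslator (lowercase "l") and
    # expects lowercase language codes; confirm your installed version accepts 'ar'.
    translator = DeeplTranslator(api_key='YOUR_DEEPL_API_KEY', source='en', target='ar')
    text = src.read_text(encoding='utf-8')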
| Font | Why |
|------|-----|
| Noto Sans Arabic (Google) | Clean, open‑source, with broad coverage of the Unicode Arabic blocks. |
| Amiri | Classic book‑style, great for printed manuals. |
| Scheherazade | Good for body text with nice ligatures. |

### 2.4 Translate the Content

#### 2.4.1 Using a CLI MT Engine (DeepL Example)

    # Install the deep-translator Python package
    pip install deep-translator
    openssl dgst -sha256 -sign private_key.pem -out system_xyz_ar.pdf.sig output/system_xyz_ar.pdf

| Command | What it does |
|---------|--------------|
| `git init && git add . && git commit -m "init"` | Version‑control the source docs. |
| `pandoc -s src.md -o src.pdf` | Test a simple PDF export (English). |
| `python translate.py` | Run the MT + placeholder script (see 2.4.1). |
| `pandoc --template=arabic.tex --pdf-engine=xelatex -o final.pdf arabic.md` | Produce the final Arabic PDF. |
| `pdftotext final.pdf - \| grep -qP '[\x{0600}-\x{06FF}]' && echo OK` | Verify Arabic text is real, not an image (requires GNU grep for `-P`). |

## 4️⃣ Common Pitfalls & How to Avoid Them

| Pitfall | Fix |
|---------|-----|
| Arabic letters appear disconnected | Use a Unicode‑aware engine (XeLaTeX) and a proper Arabic font (Noto Sans Arabic). |
    # Simple script to translate a Markdown file
    python - <<'PY'
    # deep-translator exports the DeepL wrapper as DeeplTranslator (lowercase "l")
    from deep_translator import DeeplTranslator
    import pathlib, sys, re
    \newenvironment{RTL}{\beginR}{\endR}
    \begin{document}
    \begin{RTL}
    $if(title)$
    \section*{$title$}
    $endif$