From a9b7e06209c70cb99ab313145d0aa13084260b81 Mon Sep 17 00:00:00 2001
From: Robert Grancsa <robiku1975@gmail.com>
Date: Sun, 9 Mar 2025 18:06:35 +0200
Subject: [PATCH] Added first part of homework

Signed-off-by: Robert Grancsa <robiku1975@gmail.com>
---
 README.md | 110 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 103 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 3937d0e..d741cac 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,105 @@
-# Perfect assignment
+# Homework 1 
 
-Write a program that given a number as input argument prints the corespondig number of 1s on standard output.
+## Assignment: Simple C-to-Assembly Compiler
 
-E.g:
-```bash
-$> ./binary 3
-1 1 1
-```
+### Objective
+
+Develop a small compiler or translator that converts simple C-like code snippets into basic assembly instructions. The primary goal is to familiarize yourself with assembly language mnemonics and to understand how high-level constructs map to low-level operations. This assignment is intentionally minimalistic to ease you into both assembly language and compiler design. To be pedantic, this is actually a [transpiler](https://en.wikipedia.org/wiki/Source-to-source_compiler) implementation.
+
+### Theme
+
+_Simple C Statements → Real Assembly_
+The translation should be as simple as possible while covering basic arithmetic operations, register usage, data movement, and control flow constructs. At a lower level, this is what happens in the background when a C program is compiled, but it is simplified here for easier implementation. If you are really interested in how a compiler works in reality, we recommend the 4th-year [Compilers](https://gitlab.cs.pub.ro/Compilatoare) course.
+
+## Conventions and Guidelines
+
+Because we want to spare you from having to juggle registers, the following conventions apply:
+
+- **Basic Register Mapping:**
+  - `A` → `eax`
+  - `B` → `ebx`
+  - `C` → `ecx`
+  - `D` → `edx`
+  - For vector (array) access, use `esi` (or `edi`), e.g., if `v = [1, 2, 3]`, the base address can be represented by a label like `$vec`.
+- **Array Notation:**
+  - Declaring an array: `v = [1, 2, 3]`  
+  - Accessing an element: `c = v[1]` should translate into something like:
+    ```
+    MOV esi, $vec    ; Load the base address of the vector
+    MOV ecx, [esi + 4] ; Load the second element (assuming 4 bytes per element)
+    ```
+  - When 
+- **Data Types:**
+  - We'll assume all data types are **4 bytes** wide
+  - When you see a number, treat it as an `int` (4 bytes)
+  - Pointers are also stored as 4 bytes
+
+## Instructions
+
+### MOV
+
+The `MOV` instruction is the simplest of all and, as the name says, it moves data from one place to another. More details here
+
+#### Usage
+
+`MOV destination, source`
+
+#### C - ASM translation
+
+| **C Code** 	| **ASM Code**   	|
+|------------	|----------------	|
+| `a = 1;`   	| `MOV eax, 1`   	|
+| `b = a;`   	| `MOV ebx, eax` 	|
+
+**Note**: There will be no `MOV 2, eax`, as that is an invalid operation: an immediate value cannot be a destination.
+
+### Logical operations
+
+#### Usage
+
+`AND destination, source`
+`OR destination, source`
+`XOR destination, source`
+
+#### C - ASM translation
+
+| **C Code** 	    | **ASM Code**   	|
+|------------	    |----------------	|
+| `a = a & 0xFF;` | `AND eax, 0xFF` |
+| `b = a \| b;`  | `OR ebx, eax`   |
+| `c = a ^ c;`    | `XOR ecx, eax`  |
+
+### Arithmetic operations
+
+There are four arithmetic operations, but we will cover them in two pairs, as the pairs behave differently. The first two, `ADD` and `SUB`, implement the `+` and `-` operations. They use the same `destination, source` operand form as the `MOV` instruction.
+
+#### Usage - add, sub
+
+`ADD destination, source` `SUB destination, source`
+
+#### C - ASM translation
+
+| **C Code** 	    | **ASM Code**   	|
+|------------   	|----------------	|
+| `a = a + 5;`   	| `ADD eax, 5`   	|
+| `b = b - a;`   	| `SUB ebx, eax` 	|
+
+#### Usage - mul, div
+
+`MUL source2`
+`DIV divisor`
+
+You might be wondering why there is only one operand. A special rule applies here:
+
+`MUL` multiplies EAX by source2 and stores the result across two registers, EDX and EAX. EDX holds the upper 32 bits and EAX the lower 32 bits, so together they act as one big 8-byte register. This is needed because the product of two 32-bit numbers might not fit in 32 bits. Looking at the table will make it a bit clearer.
+
+| **C Code** 	    | **ASM Code**   	|
+|------------   	|----------------	|
+| `a = a * 3;`   	| `MUL 3`   	    |
+| `b = b * c;`      | `MOV eax, ebx`    |
+|                   | `MUL ecx`         |
+|                   | `MOV ebx, eax`    |
+
+The last three rows together translate the second statement, `b = b * c;`.
+
+**Note**: For simplicity, we will never use `eax` as source2, and we will never use the value from EDX.
-- 
GitLab