17. Linking and Object file binary representation#
17.1. Why is the linking step important?#
Let’s make an empty folder to work with. I made my in my inclass/sytems
folder.
mkdir cexample
and then go in that new folder
cd cexample/
We are going to make two files, first
nano main.c
and put this content in the file:
/* Used to illustrate separate compilation.
Created: Joe Zachary, October 22, 1992
Modified:
*/
#include <stdio.h>
void main () {
int n;
printf("Please enter a small positive integer: ");
scanf("%d", &n);
printf("The sum of the first n integers is %d\n", sum(n));
printf("The product of the first n integers is %d\n", product(n));
}
And then make a second file:
nano help.c
with the contents that describes two functions:
/* Used to illustrate separate compilation
Created: Joe Zachary, October 22, 1992
Modified:
*/
/* Requires that "n" be positive. Returns the sum of the
first "n" integers. */
int sum (int n) {
int i;
int total = 0;
for (i = 1; i <= n; i++)
total += i;
return(total);
}
/* Requires that "n" be positive. Returns the product of the
first "n" integers. */
int product (int n) {
int i;
int total = 1;
for (i = 1; i <= n; i++)
total *= i;
return(total);
}
First we can make the two objects:
gcc -Wall -g -c main.c
but here we get an error:
main.c:8:1: warning: return type of 'main' is not 'int' [-Wmain-return-type]
void main () {
^
main.c:8:1: note: change return type to 'int'
void main () {
^~~~
int
main.c:12:52: error: implicit declaration of function 'sum' is invalid in C99
[-Werror,-Wimplicit-function-declaration]
printf("The sum of the first n integers is %d\n", sum(n));
^
main.c:13:56: error: implicit declaration of function 'product' is invalid in C99
[-Werror,-Wimplicit-function-declaration]
printf("The product of the first n integers is %d\n", product(n));
^
1 warning and 2 errors generated.
We can get around this, by telling main about the functions by adding
int sum(int n);
int product (int n);
to the main.c
nano main.c
so that it is like this:
/* Used to illustrate separate compilation.
Created: Joe Zachary, October 22, 1992
Modified:
*/
#include <stdio.h>
int sum(int n);
int product(int n);
void main () {
int n;
printf("Please enter a small positive integer: ");
scanf("%d", &n);
printf("The sum of the first n integers is %d\n", sum(n));
printf("The product of the first n integers is %d\n", product(n));
}
Now we can make this object file
gcc -Wall -g -c main.c
main.c:10:1: warning: return type of 'main' is not 'int' [-Wmain-return-type]
void main () {
^
main.c:10:1: note: change return type to 'int'
void main () {
^~~~
int
1 warning generated.
we get a warning, but that’s okay
ls
help.c main.c main.o
If we try to make this an executable, we cannot because then it looks for what those functions are supposed to do, but it cannot do that yet.
gcc -o demo main.o -lm
Undefined symbols for architecture x86_64:
"_product", referenced from:
_main in main.o
"_sum", referenced from:
_main in main.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
So, we make the object file for the helper functions
gcc -Wall -g -c help.c
Now, we can finally link and make it executble:
gcc -o demo main.o help.o -lm
and we can run our program
./demo
Please enter a small positive integer: 5
The sum of the first n integers is 15
The product of the first n integers is 120
Tip
One reason we split code is to make it readable, but another reason is what we just did. We can compile each file separately, when your code is large and compiling takes a long time, splitting it will mean you only have to recompile the file(s) you have recently changed and relink, instead of recompiling everything.
If we edit,
nano main.c
to look like the following:
/* Used to illustrate separate compilation.
Created: Joe Zachary, October 22, 1992
Modified:
*/
#include <stdio.h>
int sum(int n);
int product(int n);
void main () {
int n;
printf("Hello! Please enter a small positive integer: ");
scanf("%d", &n);
printf("The sum of the first n integers is %d\n", sum(n));
printf("The product of the first n integers is %d\n", product(n));
}
We added an !
so it is just different.
If we run again, it is not different.
./demo
Please enter a small positive integer: 7
The sum of the first n integers is 28
The product of the first n integers is 5040
Now, we need to recompile the main file.
gcc -Wall -g -c main.c
main.c:10:1: warning: return type of 'main' is not 'int' [-Wmain-return-type]
void main () {
^
main.c:10:1: note: change return type to 'int'
void main () {
^~~~
int
1 warning generated.
and then relink them into an executable
gcc -o demo main.o help.o -lm
for the changes to apply
./demo
Hello! Please enter a small positive integer: 8
The sum of the first n integers is 36
The product of the first n integers is 40320
We see that this makes the executble and the two object files
ls -la
total 136
drwxr-xr-x 7 brownsarahm staff 224 Nov 7 12:53 .
drwxr-xr-x 9 brownsarahm staff 288 Nov 7 12:37 ..
-rwxr-xr-x 1 brownsarahm staff 50072 Nov 7 12:53 demo
-rw-r--r-- 1 brownsarahm staff 474 Nov 7 12:40 help.c
-rw-r--r-- 1 brownsarahm staff 2364 Nov 7 12:49 help.o
-rw-r--r-- 1 brownsarahm staff 383 Nov 7 12:52 main.c
-rw-r--r-- 1 brownsarahm staff 2400 Nov 7 12:53 main.o
We can change the name of the executable with the -o
option.
gcc -o sumProd main.o help.o -lm
./sumProd
Hello! Please enter a small positive integer: 8
The sum of the first n integers is 36
The product of the first n integers is 40320
cat help.o
????8 ??X?
__text__TEXT~X?__debug_str__DWARF~??__debug_abbrev__DWARFCZ?__debug_info__DWARF???__apple_names__DWARF_X?__apple_objc__DWARF?$__apple_namespac__DWARF?$3__apple_types__DWARF?GW__compact_unwind__LDH@?__eh_frame__TEXT?h?
h__debug_line__DWARF??H 2
,
PUH??}??E??E??E?;E???E?E?E?E????E???????E?]?UH??}??E??E??E?;E???E??E?E?E????E???????E?]?Apple clang version 12.0.0 (clang-1200.0.32.2)help.c/Library/Developer/CommandLineTools/SDKs/MacOSX.sdkMacOSX.sdk/Users/brownsarahm/Documents/inclass/systems/cexamplesumproductintnitotal%?|?:
;
'I?:
;
I4:
;
I$>
?
/6ju~=V?
??|?
??x?
??@>V???|???x???t???HSAH
???????
F?7?8H?2?vHSAH
????HSAH
????HSAH
0??
4??$=@>zRx
help.c????>A?C ?$X???????=A?C
v ut<<
g <e? Z<_
v ut<<
g <s? Z<w3& +@
_product_sum```
```{code-cell} bash
:tags: ["skip-execution"]
cat ~/.bash_profile
alias pip=pip3
alias python=python3
export PS1="\u@\W $ "
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/Users/brownsarahm/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/Users/brownsarahm/anaconda3/etc/profile.d/conda.sh" ]; then
. "/Users/brownsarahm/anaconda3/etc/profile.d/conda.sh"
else
export PATH="/Users/brownsarahm/anaconda3/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
python
Python 3.11.4 (main, Jul 5 2023, 09:00:44) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 4+5
9
>>> name = 'sarah'
>>> name
'sarah'
>>> help(ord)
>>> ord('s')
115
>>> [ord(char) for char in name]
[115, 97, 114, 97, 104]
>>> [ord(char) for char in 'denno']
[100, 101, 110, 110, 111]
>>> bytearry(35)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'bytearry' is not defined. Did you mean: 'bytearray'?
>>> bytearray(35)
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
>>> bytearray(37)
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
>>> bytearray(34)
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
>>> type(bytearray(34))
<class 'bytearray'>
>>> type(34)
<class 'int'>
>>> type(print)
<class 'builtin_function_or_method'>
>>> type(ord)
<class 'builtin_function_or_method'>
>>> type(ord('s'))
<class 'int'>
>>> bytes(bytearray(34))
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> bytes(bytearray(3))
b'\x00\x00\x00'
>>> bytes(bytearray(2))
b'\x00\x00'
>>> name_ints = [ord(char) for char in name]
>>> with open('name.txt','wb') as f:
... f.write(bytes(bytearray(name_ints)))
...
5
>>> with open('name.txt','wb') as f:
... f.write(bytes(bytearray(name_ints)))
...
5
>>> exit()
ls
demo help.o main.o sumProd
help.c main.c name.txt
cat name.txt
sarah```
```{code-cell} bash
:tags: ["skip-execution"]
python
Python 3.11.4 (main, Jul 5 2023, 09:00:44) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> name = 'sarah'
>>> name_ints = [ord(char) for char in name]
>>> name_ints = [ord(char)+5 for char in name]
>>> with open('name.txt','wb') as f:
... f.write(bytes(bytearray(name_ints)))
...
5
>>> exit()
cat name.txt
xfwfm```
```{code-cell} bash
:tags: ["skip-execution"]
python
Python 3.11.4 (main, Jul 5 2023, 09:00:44) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> name_ints = [ord(char)-100 for char in name]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'name' is not defined
>>> name = 'sarah'
>>> name_ints = [ord(char)+200 for char in name]
>>> with open('name.txt','wb') as f:
... f.write(bytes(bytearray(name_ints)))
...
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
ValueError: byte must be in range(0, 256)
>>> with open ('name.txt', 'rb') as f:
... content = f.read()
...
>>> content
b''
>>> name_ints = [ord(char)-32 for char in name]
>>> with open('name.txt','wb') as f:
... f.write(bytes(bytearray(name_ints)))
...
5
>>> with open ('name.txt', 'rb') as f:
... content = f.read()
...
>>> content
b'SARAH'
>>> type(content)
<class 'bytes'>
>>> bin(115)
'0b1110011'
>>> char(d)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'char' is not defined. Did you mean: 'chr'?
>>> int('4')
4
>>> a = '4'
>>> type(a)
<class 'str'>
>>> type(int(a))
<class 'int'>
>>> msg = [68, 114, 46, 32, 66, 114, 111, 119, 110,10]
>>> bytes(bytearray(msg))
b'Dr. Brown\n'
>>> exit()
17.2. Prepare for Next Class#
Review your IDE notes (or complete them if you have not yet) ide_thoughts.
Think about what features or extensions in your favorite IDE you like the most and that you think others may not know about.
17.3. Review today’s class#
Create some variations of the
hello.c
we made in class. Make hello2.c print twice with 2 print commands. Make hello5.c print 5 times with a for loop and hello7.c print 7 times with a for loop. Build them all on the command line and make sure they run correctly.Write a bash script, assembly.sh to compile each program to assembly and print the number of lines in each file.
Put the output of your script in hello_assembly_compare.md. Add to the file some notes on how they are similar or different based on your own reading of them.
17.4. More Practice#
On Seawulf, modfiy main.c from class to accept the integer as a command line argument instead of via input while running the program. See this tutorial for an example.
5. Write a bash script demo_test.sh that runs your compiled program for each integer from 10 to 30 (syntax for a range is {start..end}
so this would be {10..30}
)
6. Write an sbatch script, batchrun.sh to run your script on a compute node and save the output to a file. The sbatch script should compile and link the program and then call the script. see the options
7. use scp to download your modified main, script files, and output to your local computer and include them in your kwl repo.
17.5. Experience Report Evidence#
17.6. Questions After Today’s Class#
17.6.1. Why didn’t python work locally for me?#
On Windows, you may need to reinstall GitBash with different settings.
17.6.2. Can I get a bunch of binary representations of letters and use it to write to a file in characters?#
yes
17.6.3. Are we implementing more function/loops in the future?#
More can be done, but not a lot more in class time
17.6.4. What would be a case where we need to use bytes conversion in our code?#
If you need to compress data to work in a low memory environment.
17.6.5. Why would I want to write and compile code from a Terminal and not from and IDE?#
If you are working on a server where you do not have an IDE available. Or if you are doing something really quick.
17.6.6. Is it ok that I see blobs in the object file, not just od characters?#
Yes! Different computers will make different amounts of progress at this
17.6.7. what else can i do with c / python in the terminal#
You can do anything the langague can do through a terminal. We had programming before we had modern IDEs.
17.6.8. Does the order of the compiling command matter?#
This is a practice badge question.
17.6.9. What benefit is it to create python scripts on the server?#
If it is processing large amount of data or if it has complex calculations that are intensive.
17.6.10. Is there a way to view a text file in raw bytes with a bash command?#
the terminal cannot show it as the raw bytes, but we will see some ways to view the file in different representations.
17.6.11. Is it better to use python in the terminal?#
it is not better, it is just another option. Having more options also helps you see the how it really works.
17.6.12. Most people will tell you python is extremely slow, so why are we still using it?#
Python is slower than C, but it is much easier to read than fast lanaguages. So it is often faster to write. It also has more effecitve, well written libraries and consistent performance. It is essentially the only option for modern effecitve AI models unless you are writing it all from scratch with is generally intractable.
17.7. Good Explore badge questions#
how would linking a third file to the executable affect how it runs, if it has nothing to do with the res to the code