7. Why are these tools like this?#

Today we’re going to do a bit more practical stuff, but we are also going to start to get into the philosophy of how things are organized.

Understanding the context and principles will help you remember the what and help you understand when you shoudl do things as they have always been done and when you should challenge and change things.

7.1. How do I decide how to resolve a merge conflict?#

First lets make a new branch

git checkout -b examplebranch
Switched to a new branch 'examplebranch'
git status
On branch examplebranch
nothing to commit, working tree clean

Now we will add some text to a file that “tells a story” of what we might have actually done.

echo "edit code near a bug I did not know about" > important_classes.py 

and commit this change

git add .
git commit -m 'new code'
[examplebranch bb4e0d8] new code
 1 file changed, 1 insertion(+)

Then back on the main branch

git checkout main
Switched to branch 'main'
Your branch is ahead of 'origin/main' by 1 commit.
  (use "git push" to publish your local commits)

we will pretend a bug got fixed in that same file. This is a realistic-ish scenario. You might be workign on adding a new feature then get a bug report and fix the bug on a different branch and get it merged into main so that others using the file

echo "fix a bug" >> important_classes.py 
git add .

git commit -m 'bug fix'
[main 4fa9114] bug fix
 1 file changed, 1 insertion(+)
git status
On branch main
Your branch is ahead of 'origin/main' by 2 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
git merge --help

git checkout examplebranch
Switched to branch 'examplebranch'
git merge main
Auto-merging important_classes.py
CONFLICT (content): Merge conflict in important_classes.py
Automatic merge failed; fix conflicts and then commit the result.
cat important_classes.py 
<<<<<<< HEAD
edit code near a bug I did not know about
=======
fix a bug
>>>>>>> main
nano important_classes.py 
git status
On branch examplebranch
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   important_classes.py

no changes added to commit (use "git add" and/or "git commit -a")

7.2. How do I exit vi?#

vi

If you start it this way, you exit by pressing only q beacuse it will already have the : at the bottom. if you are in ~insert~ mode then you have to press esc first and then : then you may want w to save first before you quit

7.3. Why are we studying developer tools?#

The best way to learn design is to study examples [Schon1984, Petre2016], and some of the best examples of software design come from the tools programmers use in their own work.

Software design by example

7.4. Unix Philosophy#

wiki

  • composability over monolithic design

  • social conventions

The tenets:

  1. Make it easy to write, test, and run programs.

  2. Interactive use instead of batch processing.

  3. Economy and elegance of design due to size constraints (“salvation through suffering”).

  4. Self-supporting system: all Unix software is maintained under Unix.

echo "hello"
hello
echo "hello" > greeting.md

echo $PATH
/Users/brownsarahm/.rbenv/shims:/Library/Frameworks/Python.framework/Versions/3.8/bin:/opt/anaconda3/bin:/opt/anaconda3/condabin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin
award --help
Usage: award [OPTIONS] BADGE_NAME

  award a badge to a student and output the signature as a receipt

  parameters - badge_name : string     the name of the badge
  formatted like type.YYYY-MM-DD for      type in {experience, review,
  practice,community} or type.keyword for type     in {explore, build,
  community}

Options:
  -s, --student-gh-name TEXT
  -p, --gradebook-path TEXT
  --help                      Show this message and exit.
pwd
/Users/brownsarahm/Documents/inclass/systems/github-in-class-brownsarahm-1

7.5. How do we Study (computer) Systems#

When we think of something as a system, we can study it different ways:

  • input/output behavior

  • components

  • abstraction layers

These basic ideas apply whether a computer system or not. We can probe things in different ways.

In a lot of disciplines people are taught one or the other, or they divide professionally into theorists or experimentalists along the lines.

People are the most effective at working with, within, and manipulating systems when they have multiple ways to achieve the same goal.

These are not mutually exclusive we will use them all together, and trade off.

When we study a system we can do so in two main ways. We can look at the input/output behavior of the system as a whole and we can look at the individual components. For each component, we can look at its behavior or the subcomponents. We can take what we know from all fo the components and piece that together. However, for a complex system, we cannot match individual components up to the high level behavior. This is true in both computers and other complex systems. In the first computers in the 1940s, the only things they did was arithmetic and you could match from their components al the way up pretty easily. Modern computers connect to the internet, send signals, load complex graphics, play sounds and many other things that are harder to decompose all at once. Outside of computers, scientists have a pretty good idea of how neurons work and that appears to be the same across mammals and other species (eg squid) but we do not understand how the whole brain of a mammal works, not even smaller mammals with less complex social lives than humans. Understanding the parts is not always enough to understand all of the complex ways the parts can work together. Computers are much less complicated than brains. They were made by brains.

But that fact motivates another way to study a complex system, across levels of abstraction. You can abstract away details and focus on one representation. This can be tied literally to components, but it can also be conceptual. For example, in CSC211 you use a model of stack and heap for memory. It’s useful for understanding programming, but is not exactly what the hardware does. At times, it is even more useful though than understanding exactly what the hardware does. These abstractions also serve a social, practical purpose. In computing, and society at large really, we use standards these are sets of guidelines for how to build things. Like when you use a function, you need to know it’s API and what it is supposed to do in order to use it. The developers could change how it does that without impacting your program, as long as the API is not changed and the high level input/output behavior stays the same.

7.5.1. Behavior#

Try something, see what happens

This is probably how you first learned to use a computer. Maybe a parent showed you how to do a few things, but then you probably tried other things. For most of you, this may have been when you were very young and much less afraid of breaking things. Over time you learned how to do things and what behaviors to expect. You also learned categories of things. Like once you learned one social media app and that others were also social media you then looked for similar features. Maybe you learned one video game had the option to save and expected it in the next one.

Video games and social media are classes or categories of software and each game and app are instances. Similarly, an Integrated Development Environment (IDE) is a category of software and VS Code, … are instances. Also, version control is a category of software and git is an instance. A git host is also a category and GitHub is an instance. Just as before you were worried about details you transferred features from one instance to another within categories, I want you to think about what you know from one IDE and how that would help you learn another. We will study the actual features of IDE a=and what you might want to know about them so that you can choose your own. Becoming a more independent developer you’ll start to have your own opinions about which one is better. Think about about a person in your life who finds computers and technology overall intimidating or frustrating. They likely only use one social media app if at all, or maybe they only know to make documents in Microsoft word and they think that Google Docs is too much to learn, because they didn’t transfer ideas from one to the other.

We have focused on the behavior of individual applications to this point, but there is also the overall behavior of the system in broad terms, typing on the keyboard we expect the characters to show (and when they don’t for example in a shell password, we’re surprised and concerned it is not working).

7.5.2. Components#

Take it apart, asess the pieces

We have the high level parts: keyboard, mouse, monitor/screen, tower/computer. Inside we also tend to know there is a power supply, a motherboard, graphics card, memory, etc.

We can study how each of these parts works while not worrying about the others but having them there. This is probably how you learned to use a mouse. You focused your attention on the mouse and saw what else happened.

Or we can take an individual component and isolate it to study it alone. For a mouse this would be hard. Without a computer attached its output is not very visible. To do this, we would need additional tools to interpret its output and examine it. Most computer components actually would need additional tools, to measure the electrical signals, but we could examine what happens at each part one at a time to then build up what they do.

This idea, however that we can use another tool to understand each component is an important one. This is also a way to again, take care and study each piece even within a software-alone system without worrying about the hardware.

7.5.3. Abstraction#

use a model

As we talked about the behavior and abstraction, we talked some about software and some about hardware. These are the two coarsest ways we can think about a computer system at different levels of abstraction. We can think about it only in physical terms and examine the patterns of electricity flow or we can think about only the software and not worry about the hardware, at a higher level of abstraction.

However, two levels is not really enough to understand how computer systems are designed.

layers

Application - the software you run.

Algorithm - the way that it is implemented, in mathematical level

Programming language - the way that it is implemented for a computer.

Let’s take a simple example, let’s say we are talking about a simple search program that we wrote that finds xx. We can say that you put in a part of a file name and it shows you all the ones with a similar name. That description is at the application level it gives the high level behavior, but not the step by step of how it does it. Let’s say we implemented it using bubble search then searches by … That’s the algorithm level, this is still abstract and could be implemented in different ways, but we know the steps and we can use this to know some things about how fast it will be, what types of result swill make it slower, etc. At the programming level language then we know which language it was done in and we see more details. At this level, we can see the specific data structures and controls structures. These implementation details can also impact performance in terms of space(memory) or time. Still at this level, we do not need to know how the actual hardware works, but we see it in increasing detail. At each level we have different types of operations. At the application we might have input, press enter, get results. At the algorithm we have check the value, compare. At the programming language level we need more specifics too, like assign or append.

After the Programming language level, there is assembly. The advantage to assembly is that it is hardware independent and human readable. It is low level and limited to what the hardware can do, but it is a version of that that can be run on different hardware. It is much lower level. When you compile a program, it is translated to assembly. At this level, programs written in different level become indistinguishable. This has much lower level operations. We can do various calculations, but not things like compare. Things that were one step before, like assign become two, choose a memory location, then write to memory. This level of abstraction is the level of detail we will think about most. We’ll look at the others, but spend much less time below here.

Machine code translates to binary from assembly.

The instruction set architecture is, notice, where the line between software and hardware lives. This is because these are specific to the actual hardware, this is the level where there are different instructions if you have an Intel chip vs an Apple chip. This level reduces down the instructions even more specifically to the specifics things that an individual piece of hardware does and how they dit.

The microarchitecture is the specific circuits: networks of smaller individual components. Again, we can treat the components as blocks and focus on how they work together. At this level we still have calculations like add, multiply, compare, negate, and we can store values and read them. That is all we have at this point though. At this level there are all binary numbers.

The actual gates (components that implement logical operations) and registers (components that hold values) break everything down to logical operations. Instead of adding, we have a series of and, nand, or, and xor put together over individual bits. Instead of numbers, we have registers that store individual zeros or ones. In a modern digital, electrical computer, at this level we have to actually watch the flow of electricity through the circuit and worry about things like the number of gates and whether or not the calculations finish at the same time or having other parts wait so that they are all working together. We will see later that when we try to allow multiple cores to work independently, we have to handle these timing issues at the higher levels as well. However, a register and gate can be implemented in different ways at the device level.

The device (or transistor in modern electrical digital computers) level, is where things transition between analog and digital. The world we live in is actually all analog. We just pay attention to lots of things at a time scale at which they appear to be be digital. Over time devices have changed from mechanical switches to electronic transistors. Material science innovations at the physics level have improved the transistors further over time, allowing them to be smaller and more heat efficient. Because of abstraction, these changes could be plugged into new hardware without having to make any changes at any software levels. However, they do enable improvements at the higher levels.

Note

For example, Bayesian statistics is a philosophy that treats probability as subjective uncertainty instead of as frequency. This has some interpretative differences, but most importantly it means that we always need an extra factor (multiplied term) in our calculations. This makes all of the math much more complex. For many decades Bayesian statistics was not practical for anything but the simplest models. However, with improvements to computers, that opened new options at the algorithm level. The first large scale application of this type of statistics was by Microsoft after their researchers built a Bayesian player model for player matching in Halo.

7.6. Review today’s class#

  1. Read today’s notes when they are posted.

  2. Add to your software.md a section about if that project does or does not adhere to the unix philosophy.

  3. create methods.md and answer the following:

- which of the three methods for studying a system do you use most often when debugging? 
- do you think using a different strategy might help you debug faster sometimes? why or why not? 

7.7. Prepare for Next Class#

  1. install jupyterbook on Mac or linux those instructions will work on your regular terminal, if you have python installed. On Windows those instructions will work in the Anaconda prompt or any other terminal that is set up with python. If these steps do not make sense see the recommendations in the syllabus for more instructions including videos of the Python install process in both Mac and Windows.

  2. If you like to read about things before trying them, skim the jupyterbook docs.

  3. Think about and be prepared to reply to questions in class about your past experiences with documentation, both using it and writing it.

7.8. More Practice#

  1. Read today’s notes when they are posted.

  2. Add to your software.md a section about if that project does or does not adhere to the unix philosophy and why.

  3. create methods.md and answer the following:

- which of the three methods for studying a system do you use most often when debugging? 
- which do you use when you are trying to understand something new? 
- do you think the ones you use most often are consistently the effective? why or why not? When do they work or not work? 
- what are you most interested in trying that might be different?

7.9. Experience Report Evidence#

complete the review or practice badge an link that to your experience report.

Also, update your experiencereport.yml to match the following but with <ta-gh-name> changed to be your group’s TA’s actual gh user name.

name: Experience Report (makeup)
on:
  workflow_dispatch:
    inputs:
      date:
        description: 'missed class date'
        required: true
        type: string


jobs:
  createPullRequest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Make changes to pull request
      # TODO: modify to create template file
        run: |
          exptitle="experience-${{ inputs.date }}.md"
          cp .templates/experience-report.md experiences/$exptitle
      - name: Create Pull Request
        id: cpr
        uses: peter-evans/create-pull-request@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: initialize experience report
          committer: GitHub <noreply@github.com>
          author: ${{ github.actor }} <${{ github.actor }}@users.noreply.github.com>
          signoff: false
          branch: experience
          branch-suffix: timestamp
          title: 'Experience Report ${{ inputs.date }}'
          body: |
            Checklist:
            - [ ] Merge prepare work into this PR
            - [ ] Link prepare issue to this PR
            - [ ] Complete experience report
            - [ ] Add activity completion evidence per notes
          reviewers: <ta-gh-here>

7.10. Questions After Today’s Class#

7.10.1. Questions that are good explore badge topics#

  • Why did windows deviate so heavily from these principles?

  • Are there new text editors for the terminal? 2015 - present?

  • what is the difference between Unix and Linux?

7.10.2. Does learning a little bit about the other layers in layers of abstractions help create better software or applications?#

I think so, in some cases at least. A lot of the time the other layers are hidden because the abstractions work well. Sometimes though you hit a limit of some sort and knowledge of other layers canhelp you realize that or even overcome it.

7.10.3. Are there options to Nano that we can use, and in what case would it be worth using something else.#

For this class, I recommend nano because it is more friendly. vim and emacs are more popular and powerful, but less friendly.

They can be worth t when you have more complex tasks to do

7.10.4. When is it better to work in the terminal as opposed to GitHub desktop?#

Once you get used to the terminal, it saves a lot of time over using the GUI. Also, you can have better ergonomics using only the keyboard and less mouse use.

The biggest limitation of GitHub desktop is that it only works for GitHub repositories. The command line git program can be used with different hosts.

Also, on the terminal you can automate and use other commands in combination with your git actions.

7.10.5. Is assembly language inefficient to be programming in?#

It is maybe a less efficient use of your time, but otherwise not strictly speaking. However, to get good performance from your code, you have to manually make it efficient. In a compiled languge, when you compile it, it can be optimized for you.

7.10.6. Are there different langauges for machine code?#

Its not in a language per se, there are different kinds and different specifications, but none of it is human readable like the programming languages you have used to date.

7.10.7. What does assembly code look like?#

We will see more soon but it consists of simple, low level instructions and addresses to apply them to.