14. How do I work on a remote Server?#

14.1. Using SSH#

ssh -l brownsarahm seawulf.uri.edu

Your default login is your uri username (part of your URI e-mail before the @ ) and default password is your URI ID.

brownsarahm@seawulf.uri.edu's password:
You are required to change your password immediately (root enforced)
Last login: Wed Oct 26 16:44:26 2022 from 172.20.255.15
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for user brownsarahm.
Changing password for brownsarahm.
(current) UNIX password:
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
Connection to seawulf.uri.edu closed.

You have to re-enter your ID, then type your new password twice.

Then you can re-login with your new password

ssh -l brownsarahm seawulf.uri.edu
brownsarahm@seawulf.uri.edu's password:
Last login: Wed Oct 26 16:47:05 2022 from 172.20.255.15

This is regular bash, so all fo the commands we have seen so far work here

[brownsarahm@seawulf ~]$ ls
bash-lesson.tar.gz                           SRR307025_1.fastq
demo.sh                                      SRR307025_2.fastq
dmel-all-r6.19.gtf                           SRR307026_1.fastq
dmel_unique_protein_isoforms_fb_2016_01.tsv  SRR307026_2.fastq
gene_association.fb                          SRR307027_1.fastq
linecounts.txt                               SRR307027_2.fastq
my_job.sh                                    SRR307028_1.fastq
results                                      SRR307028_2.fastq
slurm-23950.out                              SRR307029_1.fastq
SRR307023_1.fastq                            SRR307029_2.fastq
SRR307023_2.fastq                            SRR307030_1.fastq
SRR307024_1.fastq                            SRR307030_2.fastq
SRR307024_2.fastq
[brownsarahm@seawulf ~]$ pwd
/home/brownsarahm
[brownsarahm@seawulf ~]$ whoami
brownsarahm

14.2. Downloading files#

[brownsarahm@seawulf ~]$ wget http://www.hpc-carpentry.org/hpc-shell/files/bash-lesson.tar.gz
--2022-10-26 16:57:52--  http://www.hpc-carpentry.org/hpc-shell/files/bash-lesson.tar.gz
Resolving www.hpc-carpentry.org (www.hpc-carpentry.org)... 172.67.146.136, 104.21.33.152
Connecting to www.hpc-carpentry.org (www.hpc-carpentry.org)|172.67.146.136|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12534006 (12M) [application/gzip]
Saving to: ‘bash-lesson.tar.gz.1’

100%[========================>] 12,534,006  32.9MB/s   in 0.4s   

2022-10-26 16:57:53 (32.9 MB/s) - ‘bash-lesson.tar.gz.1’ saved [12534006/12534006]

we can expand the compressed file

[brownsarahm@seawulf ~]$ tar -xvf bash-lesson.tar.gz
dmel-all-r6.19.gtf
dmel_unique_protein_isoforms_fb_2016_01.tsv
gene_association.fb
SRR307023_1.fastq
SRR307023_2.fastq
SRR307024_1.fastq
SRR307024_2.fastq
SRR307025_1.fastq
SRR307025_2.fastq
SRR307026_1.fastq
SRR307026_2.fastq
SRR307027_1.fastq
SRR307027_2.fastq
SRR307028_1.fastq
SRR307028_2.fastq
SRR307029_1.fastq
SRR307029_2.fastq
SRR307030_1.fastq
SRR307030_2.fastq

14.3. Working with large files#

[brownsarahm@seawulf ~]$ cat SRR307023_1.fastq

This was a very long output, so it is removed from the notes.

[brownsarahm@seawulf ~]$ head SRR307023_1.fastq
@SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
CGAGCGACTTTTGTATAACTATATTTTTCTCGTTCTTGGCTCCGACATCTATACAAATTCAGAAGGCAGTTTTGCGCGTGGAGGGACAATTACAAATTGAG
+SRR307023.37289418.1 GA-C_0019:1:120:7087:20418 length=101
GGFGGFHHHHFEBH@G@EGEGHHEHGHHHHHHHHHBHDGHHGDGFHH?H:DFEDEGGGDGD:DBDA=@BBE@G?D?FD8FDGGDD<:BAB4;4;>CF@B3B
@SRR307023.37289419.1 GA-C_0019:1:120:7127:20414 length=101
GGGCAGGGCCGTAATCAATGCCCCTCAAACACAAAGTTACTGGGAATTCCAAGTTCATGTGAACAGTTTCAGTTCACAATCCCAAGCATGAAAGGGGTTCA
+SRR307023.37289419.1 GA-C_0019:1:120:7127:20414 length=101
HEHGHHHHH@GGBGBHEH@B=EGDGAADDDEFBEFF8:EB(/,257;.;;94/4.>@3@7?A=:C@72;1@;4;A688:<E@GGG<EG3A###########
@SRR307023.37289420.1 GA-C_0019:1:120:7204:20410 length=101
CCACCCACTTAGTGCTGCACTATCAAGCAACACGACTCTTTTGAAACATCATCTAGTAATCATTAACGTTATACGGGCCTGGCACCCTCTATGGGTAAATG
[brownsarahm@seawulf ~]$ tail SRR307023_1.fastq
+SRR307023.37294415.1 GA-C_0019:1:120:17521:20714 length=101
IIIIIIHIIIHIIIIFFFF#?@;B#############################################################################
@SRR307023.37294416.1 GA-C_0019:1:120:17598:20714 length=101
GGGGCACCCACATTATACANAACCANNNANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR307023.37294416.1 GA-C_0019:1:120:17598:20714 length=101
IDIG?IHBGIIDIHIDDDD#@=@##############################################################################
@SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
GGGGCAGTGCTAAGGTACTNGAAAGNNNCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR307023.37294417.1 GA-C_0019:1:120:18444:20714 length=101
IIIGIIIHIHHIIIIEFFF#?################################################################################
[brownsarahm@seawulf ~]$ wc -l SRR307023_1.fastq
20000 SRR307023_1.fastq
[brownsarahm@seawulf ~]$ wc -l *.fastq
   20000 SRR307023_1.fastq
   20000 SRR307023_2.fastq
   20000 SRR307024_1.fastq
   20000 SRR307024_2.fastq
   20000 SRR307025_1.fastq
   20000 SRR307025_2.fastq
   20000 SRR307026_1.fastq
   20000 SRR307026_2.fastq
   20000 SRR307027_1.fastq
   20000 SRR307027_2.fastq
   20000 SRR307028_1.fastq
   20000 SRR307028_2.fastq
   20000 SRR307029_1.fastq
   20000 SRR307029_2.fastq
   20000 SRR307030_1.fastq
   20000 SRR307030_2.fastq
  320000 total
[brownsarahm@seawulf ~]$ wc -l dmel-all-r6.19.gtf
542048 dmel-all-r6.19.gtf
[brownsarahm@seawulf ~]$ grep Act5c dmel-all-r6.19.gtf
[brownsarahm@seawulf ~]$ grep mRNA dmel-all-r6.19.gtf

(truncated output)

X	FlyBase	mRNA	19961689	19968479	.	+gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0070000"; transcript_symbol "Nep3-RA";
X	FlyBase	mRNA	19963176	19968479	.	+gene_id "FBgn0031081"; gene_symbol "Nep3"; transcript_id "FBtr0307554"; transcript_symbol "Nep3-RB";
[brownsarahm@seawulf ~]$ echo "echo 'script works'" > demo.sh
[brownsarahm@seawulf ~]$ cat demo.sh
echo 'script works'
[brownsarahm@seawulf ~]$ ./demo.sh
script works
[brownsarahm@seawulf ~]$ ls -l
total 150704
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18  2021 bash-lesson.tar.gz
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18  2021 bash-lesson.tar.gz.1
-rwxr-xr-x. 1 brownsarahm spring2022-csc392       20 Oct 26 17:11 demo.sh
-rw-r--r--. 1 brownsarahm spring2022-csc392 77426528 Jan 16  2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392   721242 Jan 25  2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 25056938 Jan 25  2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392      447 Mar  8  2022 linecounts.txt
-rw-r--r--. 1 brownsarahm spring2022-csc392       84 Mar  8  2022 my_job.sh
drwxr-xr-x. 2 brownsarahm spring2022-csc392       10 Mar  8  2022 results
-rw-r--r--. 1 brownsarahm spring2022-csc392       89 Mar  8  2022 slurm-23950.out
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625262 Jan 25  2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625262 Jan 25  2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625376 Jan 25  2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625376 Jan 25  2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625286 Jan 25  2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625286 Jan 25  2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625302 Jan 25  2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625302 Jan 25  2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625312 Jan 25  2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625312 Jan 25  2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625338 Jan 25  2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625338 Jan 25  2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625390 Jan 25  2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625390 Jan 25  2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625318 Jan 25  2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625318 Jan 25  2016 SRR307030_2.fastq
[brownsarahm@seawulf ~]$ chmod +x demo.sh
[brownsarahm@seawulf ~]$ ls -l
total 150704
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18  2021 bash-lesson.tar.gz
-rw-r--r--. 1 brownsarahm spring2022-csc392 12534006 Apr 18  2021 bash-lesson.tar.gz.1
-rwxr-xr-x. 1 brownsarahm spring2022-csc392       20 Oct 26 17:11 demo.sh
-rw-r--r--. 1 brownsarahm spring2022-csc392 77426528 Jan 16  2018 dmel-all-r6.19.gtf
-rw-r--r--. 1 brownsarahm spring2022-csc392   721242 Jan 25  2016 dmel_unique_protein_isoforms_fb_2016_01.tsv
-rw-r--r--. 1 brownsarahm spring2022-csc392 25056938 Jan 25  2016 gene_association.fb
-rw-r--r--. 1 brownsarahm spring2022-csc392      447 Mar  8  2022 linecounts.txt
-rw-r--r--. 1 brownsarahm spring2022-csc392       84 Mar  8  2022 my_job.sh
drwxr-xr-x. 2 brownsarahm spring2022-csc392       10 Mar  8  2022 results
-rw-r--r--. 1 brownsarahm spring2022-csc392       89 Mar  8  2022 slurm-23950.out
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625262 Jan 25  2016 SRR307023_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625262 Jan 25  2016 SRR307023_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625376 Jan 25  2016 SRR307024_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625376 Jan 25  2016 SRR307024_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625286 Jan 25  2016 SRR307025_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625286 Jan 25  2016 SRR307025_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625302 Jan 25  2016 SRR307026_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625302 Jan 25  2016 SRR307026_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625312 Jan 25  2016 SRR307027_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625312 Jan 25  2016 SRR307027_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625338 Jan 25  2016 SRR307028_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625338 Jan 25  2016 SRR307028_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625390 Jan 25  2016 SRR307029_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625390 Jan 25  2016 SRR307029_2.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625318 Jan 25  2016 SRR307030_1.fastq
-rw-r--r--. 1 brownsarahm spring2022-csc392  1625318 Jan 25  2016 SRR307030_2.fastq
[brownsarahm@seawulf ~]$ ./demo.sh
script works

14.4. Logging out#

[brownsarahm@seawulf ~]$ exit
logout
Connection to seawulf.uri.edu closed.

14.5. Creating SSH Keys#

6base) brownsarahm@systems $ ssh-keygen -f ~/seawulf -t rsa -b 409

Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /Users/brownsarahm/seawulf.
Your public key has been saved in /Users/brownsarahm/seawulf.pub.
The key fingerprint is:
SHA256:r9DnJbLGUYbyikPqn35IDKm1+clY7r64Zv3rhitohDs brownsarahm@15.255.20.172.s.wireless.uri.edu
The key's randomart image is:
+---[RSA 4096]----+
|                 |
|                 |
|   .     .       |
|  +   . . o      |
|.o =   oSo       |
|o.o =  .o.       |
|.o @ =.oo.+ .    |
|E.*.@.+.o* o     |
|.=+BOO+oo .      |
+----[SHA256]-----+
seawful.uri.edu
brownsarahm@systems $ ssh-copy-id -i ~/seawulf brownsarahm@sseawfulf.uri.edu
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/Users/brownsarahm/seawulf.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
brownsarahm@seawulf.uri.edu's password:

Number of key(s) added:        1

Now try logging into the machine, with:   "ssh 'brownsarahm@seawulf.uri.edu'"
and check to make sure that only the key(s) you wanted were added.

Last login: Wed Oct 26 16:54:36 2022 from 172.20.255.15
[brownsarahm@seawulf ~]$ interactive
salloc: Granted job allocation 26350
salloc: Waiting for resource configuration
salloc: Nodes n005 are ready for job
[brownsarahm@n005 ~]$ lshw
[brownsarahm@n005 ~]$ exit
logout
salloc: Relinquishing job allocation 26350
[brownsarahm@seawulf ~]$ client_loop: send disconnect: Broken pipe

14.6. Removing long outputs.#

I generate the notes from the terminal export, so all of the long things we did in class were in the file I start from. I removed them, by splitting the file into pieces: taking the part I wanted off the top, putting only the end after the long output in a new file, and then taking the top of that new file a few times to get down to what we have here (and then running my processing script, and then adding explanations).

This requires using some really helpful options on grep head and tail

For example:

grep -n lshw 2022-10-26-t2.md

gives:

126:[brownsarahm@n005 ~]$ lshw

It gives the occurence and the line number that it was on.

Then I used:

head -n 126 2022-10-26-t2.md > notes/2022-10-26-3.md

To take the first 126 lines into a new file.

Then I looked for the last “exit”:

grep -n "exit" 2022-10-26-t2.md
75:[brownsarahm@seawulf ~]$ exit
3297:[brownsarahm@n005 ~]$ exit

and checked the total length of what I had left

wc -l 2022-10-26-t2.md
3301 2022-10-26-t2.md

3297 is near the end of 3301, so this tail would be the last bit of output I wanted. So I used:

tail -n +3297 2022-10-26-t2.md  > notes/2022-10-26-4.md

This takes the lines starting on line 3297 until the end.

The + sign is important. This:

tail -n +3297 2022-10-26-t2.md

would have given me the last 3297 lines.

Then I used cat with redirects to combine the files after I checked that they had what I wanted.

14.7. Review today’s class#

  1. Review the notes (to reinforce ideas and improve your memory of them)

  2. create a bash script that counts the number of lines in each .fastq file, saves the line counts to a text file, then finds how many times Act5C is in dmel-all-r6.19.gtf on the server. (save this on the server; we’ll get it downloaded locally later)

  3. Configure ssh keys to your GitHub account (this is actually GitHub’s preferred terminal authentication method)

  4. Contribute to your group repo and review someone else’s contribution

14.8. Prepare for Next Class#

  1. Make sure that you have a C compiler installed on your computer. ideally gcc

  2. Update your KWL chart. bring questions on anything we have covered so far.

14.9. More Practice#

  1. Create an sbatch script to run your script from the review on a compute node. see the options official sbatch docs while URI docs are down

  2. priority File permissions are represented numerically often in octal, by transforming the permissions for each level (owner, group, all) as binary into a number. Add octal.md to your KWL repo and answer the following. Try to think through it on your own, but you may look up the answers, as long as you link to (or ideally cite using jupyterbook citations) a high quality source.

    1. Transform the permissions [`r--`, `rw-`, `rwx`] to octal, by treating it as three bits.
    1. Transform the permission we changed our script to `rwxr-xr-x` to octal.
    1. Which of the following would prevent a file from being edited by other people 777 or 755?
    
  3. Answer the following in hpc.md of your KWL repo: (to think about how the design of the system we used in class impacts programming and connect it to other ideas taught in CS)

    1. What kinds of things would your code need to do if you were going to run it on an HPC system?
    1. What Sbatch options seem the most helpful?
    1. How might you go about setting the time limits for a script? How could you estimate how long it will take?
    
  4. Create ssh.md and include either # 3 tips for SSH keys or # 3 things to avoid with SSH keys. For each tip/misconceptions include an explanation of why it is important/wrong.

Extension ideas

  1. use the server for something for another class or a side project

  2. extend your ssh.md file to icndlue both tips and misconceptions and/or more.

14.10. Questions After Class#

14.10.1. When connecting an ssh key to GitHub, is there a best practice? Should we look at the GitHub website for that?#

GitHub has a tutorial on how to do it. They do recommend a specific (different than we used here) encryption algorithm.

14.10.2. i’d love to know more about cryptography and encryption for passwords#

You can use this as a deeper exploration topic. Also Professor Lamanga teaches a class on this topic.

14.10.3. one question I have is: what can we do with these servers, is it just for computation power for things we can’t do on our personal computers?#

This particular server is for more compute power. Different servers have different purposes. This is a good question. You can use this as a deeper exploration if you want. Also, I will see if we have time to spend some more class time on this.

14.10.4. Is there an easier way for me to search the course website to find specific commands? Like when I tried to do that earlier in the class about reading the first few lines of a file.#

The website is not currently indexed very well, but hopefully your group repo has a cheatsheet of the commands. You can add to it if it does not to make that more helfpul a a quick reference.

14.10.5. can I use c++ or c on the server since I already have the compiler?#

To use it on the server, the server has to have the compiler, not your local computer. In terms of running code you compiled locally on a different machine, that is tough. Not impossible, but requires great care.