FAQ

We are in the process of reviewing the information below and integrating it into our new documentation.

  1. Why do I get a connection timed out or Connection closed by remote host error?
  2. Why am I asked for a password when I try to log on?
  3. Can I logon to JASMIN from home?
  4. Why do I get a warning saying REMOTE HOST IDENTIFICATION HAS CHANGED?
  5. I can't remember the passphrase associated with my private key
  6. How do I logon to a JASMIN/CEMS Virtual Machine?
  7. Why am I not able to write any more files to my home directory?
  8. Listing files/directories and checking sizes: ls, du and pan_du
  9. How much space is left in my Group Workspace?
  10. Is it possible to recover a deleted file?
  11. How do I access the CEDA archive?
  12. What data can I access from JASMIN/CEMS?
  13. How do I transfer my data to JASMIN/CEMS?
  14. How can I transfer data in and out of JASMIN over GridFTP?
  15. Rsyncing data to/from JASMIN/CEMS transfer servers
  16. How do I set up a cron job?
  17. I see file names like .panfs.cc7410a.1113355902480856000. Why can't I delete them?
  18. How do I access the web from a shared analysis VM?
  19. How do I clone a github repository from a shared analysis VM?
  20. How do I install Python packages myself?
  21. What text editors are available?
  22. How to acknowledge the JASMIN team
  23. Why can't I find common Python modules on the *sci machines?
  24. User X is taking all the resources on jasmin-sciN: please tell them to desist!

1. Why do I get a connection timed out or Connection closed by remote host error?

The first thing to check is that you do have a JASMIN/CEMS account set up with us. To access JASMIN machines you need to have registered for either jasmin-login, cems-login or commercial-login on your CEDA account. Please check your current groups at MyCEDA.

If upon trying to log on to jasmin-login1/jasmin-xfer1 or cems-login1/cems-xfer1, you get a "ssh_exchange_identification: Connection closed by remote host" or "operation timed out" error, this means that there is an authentication problem. JASMIN cannot identify you and so it ends your session immediately.

Check the following options to try to resolve the problem:

Option 1 - Where are you logging in from?

If you are trying to log on to JASMIN from outside the *.ac.uk network domain, e.g. from a UK non-academic network or from abroad, you will not be able to log in to JASMIN/CEMS until we have “whitelisted” the network domain of the host from which you are connecting. Please note that you are required to connect from a host with a reverse DNS lookup. To check your reverse DNS lookup, you can point your browser to:

http://www.lawrencegoetz.com/programs/ipinfo/

This will give the IP address seen by machines you connect to such as jasmin-login1. Then follow the link to "reverse Lookup".

Option 2 - Are you using the correct public key?

Check that you are using the correct public key by typing on your local unix machine:

$ ssh-add -L

(note capital L)

This should print the public key that is being used, which you can then compare with the one uploaded to your CEDA account via MyCEDA. The two public keys should match.

Please note that a public/private key pair should be created on the machine you want to log on from, but it is not locked to the machine where it was generated. The private key may be copied to any machine which is used as the first step for initiating a session (e.g. office desktop / laptop). Uploading the public key to your MyCEDA profile ensures that it is copied automatically to all JASMIN machines to which you have been granted permission to log in. You should not copy the public key manually to the ~/.ssh/authorized_keys file on any JASMIN machine, as this will be overridden by automated processes. On your desktop/laptop, load the private key into an SSH authentication agent (e.g. ssh-agent on Unix, Pageant on Windows) on the machine you want to log on from.

If you are still having difficulties logging on, email CEDA Support with a full transcript of your login session (ssh -a -v username@jasmin-login1.ceda.ac.uk) and make a note of the time you attempted to log on, so that we can check the relevant logs at our end.

2. Why am I asked for a password when I try to log on?

You should not be asked for a password when attempting to logon to JASMIN. There are 2 occasions when you may be prompted to enter a password, both indicating a problem:

1 - upon logging on to jasmin-login1/jasmin-xfer1 or cems-login1/cems-xfer1

If you are asked for a password when trying to log on to jasmin-login1/jasmin-xfer1 then it is likely to be an SSH public key related issue.

Please make sure that your public SSH key has been correctly uploaded to your MyCEDA account (https://services.ceda.ac.uk/cedasite/myceda). You can check that you are using the correct public key by typing:

$ ssh-add -L

(note capital L). This should print the public key that is being used.

Please note that public SSH keys containing line breaks or escape characters (perhaps as a result of copying and pasting) are likely to cause problems. Please take care to eliminate these characters when uploading your public key.
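One way to catch stray line breaks before uploading is to check the public key file locally. The following is a minimal sketch, not a CEDA-provided tool: the function name check_pubkey_file and the example path are illustrative assumptions. It only verifies that the file holds exactly one non-empty line with at least a key type and a key body.

```shell
# Hypothetical helper (not a JASMIN/CEDA tool): sanity-check a public key file
# before pasting its contents into MyCEDA.
# Succeeds only if the file has exactly one non-empty line with at least two
# whitespace-separated fields (key type and key body; a comment is optional).
check_pubkey_file() {
    keyfile=$1
    [ -r "$keyfile" ] || { echo "cannot read $keyfile" >&2; return 2; }
    # Count non-empty lines; a key broken by copy-and-paste shows up as more than one.
    nonempty=$(grep -c . "$keyfile" || true)
    if [ "$nonempty" -ne 1 ]; then
        echo "expected 1 non-empty line, found $nonempty" >&2
        return 1
    fi
    # The single non-empty line must contain a key type and a key body.
    awk 'NF { ok = (NF >= 2) } END { exit !ok }' "$keyfile"
}
```

For example, check_pubkey_file ~/.ssh/id_rsa_jasmin.pub returns success for a well-formed single-line key and a non-zero status otherwise.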

Please note that your MyCEDA profile only allows the storing of 1 key per user. Keys may not be shared between users and must be protected by a strong passphrase.

You can also list the keys you currently have loaded with the command:

$ ssh-add -l
				

Please note that a public/private key pair should be created on the machine you want to log on from, but it is not locked to the machine where it was generated. The private key may be copied to any machine which is used as the first step for initiating a session (e.g. office desktop / laptop). Uploading the public key to your MyCEDA profile ensures that it is copied automatically to all JASMIN machines to which you have been granted permission to log in. You should not copy the public key manually to the ~/.ssh/authorized_keys file on any JASMIN machine, as this will be overridden by automated processes. On your desktop/laptop, load the private key into an SSH authentication agent (e.g. ssh-agent on Unix, Pageant on Windows) on the machine you want to log on from.

2 - upon logging onto virtual machines from jasmin-login1/xfer1 or cems-login1/xfer1

If you are able to log on to jasmin-login1 or cems-login1 but are asked for a password when attempting to log on to a Virtual Machine on the JASMIN/CEMS network then, provided that you have been authorised to access the Virtual Machine you are attempting to reach, the problem is due to lack of agent forwarding.

If you are using Unix, please make sure that ssh-agent is running on the machine where the private key is stored and that the private key has been added to that agent (ssh-add). Be sure to use the ssh -A option to enable agent forwarding, which allows subsequent login from jasmin-login1 to jasmin-sci1, for example.

From the machine where the private key is stored, type:

$ ssh-add
				

(to load the key into the local ssh-agent) followed by

$ ssh -A <username>@jasmin-login1.ceda.ac.uk
				

So, if your initial machine (desktop/laptop) is Linux-based, you can do something like:

$ exec ssh-agent $SHELL
$ ssh-add ~/.ssh/id_rsa_jasmin     # if id_rsa_jasmin is the name of your key; you will be prompted for your passphrase
$ ssh -A username@jasmin-login1.ceda.ac.uk
[username@jasmin-login1 ]$ ssh username@jasmin-sci1.ceda.ac.uk
				

If your initial machine is running Windows, a suitable tool to use as authentication agent is Pageant, part of the PuTTY package.

Right-click the Pageant icon and select Add Key, specify your private key location and enter the passphrase when prompted. Then, before you open a login session using PuTTY on Windows, please make sure that your PuTTY configuration has "Allow agent forwarding" set. This can be found under the "Connection" -> "SSH" -> "Auth" menu.

For Mac users, please check that your configuration file (~/.ssh/config and/or /etc/ssh_config) contains a line that says "ForwardAgent yes".
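As an illustration of the "ForwardAgent yes" setting, the sketch below appends a stanza to an SSH client configuration file so that agent forwarding is enabled for the JASMIN login host only. The function name add_forward_agent_stanza is hypothetical, and limiting the setting to one host is a design choice of this example rather than a JASMIN requirement; Host and ForwardAgent are standard OpenSSH client configuration keywords.

```shell
# Hypothetical helper: enable agent forwarding for one host in an SSH client
# config file. The config file path is passed explicitly so that nothing is
# modified by accident; the host defaults to the JASMIN login gateway.
add_forward_agent_stanza() {
    config=$1
    host=${2:-jasmin-login1.ceda.ac.uk}
    # Do nothing if a stanza for this host is already present.
    if [ -f "$config" ] && grep -qxF "Host $host" "$config"; then
        return 0
    fi
    printf 'Host %s\n    ForwardAgent yes\n' "$host" >> "$config"
}
```

Calling add_forward_agent_stanza ~/.ssh/config adds the stanza once; calling it again leaves the file unchanged.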

Please note that your ssh-agent session will persist until killed or until system shutdown, even if you close the terminal in which you set it up.

Further details about logging on to JASMIN.

3. Can I logon to JASMIN from home?

For security reasons, we do not allow JASMIN login from home computers on a domestic Internet Service Provider (ISP).

If however you can remotely get onto your institute network from home (via VPN), then you should be able to access JASMIN this way from home or anywhere else.

Note that the user can copy their private key to different machines, allowing the user to login from more than one place (provided the login to JASMIN is initiated from a whitelisted network domain such as the UK academic network).

Copying the keys around is fine as long as the key has a strong passphrase and there is only one unique key per user (not one per machine). Passwordless keys must not be used as they could allow access to your account, your data and the data of others in your workspace by unauthorised people. For convenience, you are encouraged to make use of an SSH authentication agent e.g. ssh-agent or Pageant to avoid having to keep typing the passphrase for your private key at each step.

For more details, please see our documentation about porting SSH Keys.

4. Why do I get a warning saying REMOTE HOST IDENTIFICATION HAS CHANGED?

This message may be seen after some system updates when the unique key identifying the machine has been reset. Care should be taken, as potentially it could indicate that the identity of the remote host has been “spoofed” and that you could be logging in to a hacker’s duplicate of that machine. Normally, however, the CEDA / JASMIN teams will have posted news and/or other information regarding system updates and it is usually safe to proceed by doing the following. If in any doubt, please check with the helpdesk.

Before you log on to the JASMIN server, please update the ~/.ssh/known_hosts file on your calling machine by using an editor to remove all JASMIN entries.

A typical JASMIN server entry looks like:

jasmin-login1.ceda.ac.uk,130.246.142.229 ssh-rsa AAAAB...==
				

Please remove it and then save and exit the editing session.

Once you have removed the JASMIN entries, you should be able to log on as before, although because the host will no longer exist in the known_hosts file, you will see a message implying that this is the first time you have connected to this machine and asking you if this should be added to the known_hosts file. If you are happy to proceed, answer yes. If you answer no, your ssh client will not continue with the login process.
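As an alternative to editing the file by hand, OpenSSH's ssh-keygen can remove the entries for a named host with its -R option. The wrapper below is a minimal sketch; the function name forget_host_key is an assumption made for illustration, while -R and -f are standard ssh-keygen options.

```shell
# Sketch: remove stored host keys for one host from a known_hosts file.
# ssh-keygen -R also copes with hashed host names and keeps the previous
# contents of the file in <file>.old.
forget_host_key() {
    host=$1
    known_hosts=${2:-$HOME/.ssh/known_hosts}
    ssh-keygen -R "$host" -f "$known_hosts"
}
```

For example, forget_host_key jasmin-login1.ceda.ac.uk removes the entry for that host from your default known_hosts file. If an entry was stored separately under the IP address, the same call can be repeated with that address.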

5. I can't remember the passphrase associated with my private key

The passphrase that protects your private key is on your local machine only and is not stored, used or updated on any of the JASMIN machines.

If you cannot remember your passphrase associated with your private key then you will have to generate a new pair of SSH keys. Then remember to upload your new public key to MyCEDA. The update should be validated throughout the system within an hour.

It is possible to update the passphrase associated with your private key using the -p option with ssh-keygen (see the ssh-keygen man page for details). However, this requires you to know the existing passphrase. In this case the public key does not need to be re-uploaded, as you will have only updated the passphrase on the private half of the key (but the updated private key would need to be copied to any other machines where you store it).

6. How do I logon to a JASMIN/CEMS Virtual Machine?

Provided you have received confirmation from CEDA Support that you have access to the relevant Virtual Machine, you'll need to login via jasmin-login1.ceda.ac.uk or cems-login1.ceda.ac.uk (depending which server you have been granted access to).

jasmin-login1 is the front door to the JASMIN VMs: it has no function other than to act as a single point of entry to the JASMIN VM network. The equivalent for CEMS is cems-login1.cems.rl.ac.uk, and for industrial research partner users it is comm-login1.cems.rl.ac.uk. These SSH gateway or “bastion” servers should not be used for any processing or data transfers: there are other, more suitably-resourced machines for providing these services.

Data transfer nodes jasmin-xfer1, cems-xfer1 may be used directly from outside the firewall (without going via jasmin-login1/cems-login1). This is to enable users to transfer data to and from Group Workspaces via appropriate protocols such as scp, rsync and bbcp (and eventually GridFTP). All the JASMIN Group Workspaces (GWS) are visible (and read/writeable to those users with permission) on the jasmin-xfer1 VM. Again, the same applies to CEMS. An additional machine, jasmin-xfer2, provides a high-performance data transfer service for users with particularly large and/or high-speed transfer needs, but needs specific authorisation to use.

Jasmin-sci1 is used for running analysis/processing code on data. It is inside the firewall so you can only get to it from jasmin-login1. All the JASMIN GWSs are visible (and read/writeable to those with permission) on jasmin-sci1 so that scientists can run their code and write their outputs. Much of the CEDA archive ("/badc" and "/neodc") is also visible (read-only) on jasmin-sci1 so that people can run their code using the archive as input data.

Essentially, in the case of getting onto the JASMIN VM network, the user should proceed as follows:

$ exec ssh-agent $SHELL
$ ssh-add ~/.ssh/id_rsa          # enter your passphrase when prompted
$ ssh -A -X <userid>@jasmin-login1.ceda.ac.uk
$ ssh -X <userid>@jasmin-sci1.ceda.ac.uk
				

The -X option enables X11 forwarding, -Y enables “trusted” X11 forwarding (see man page for further details)

You are welcome to use either -X or -Y. You may get a warning when using -X.

The more encryption you do the harder machines at each end of the link have to work, so there is a performance tradeoff. We suggest that you try both and see what works best for you.

Please note that your ssh-agent session will persist until killed or until system shutdown, even if you close the terminal in which you set it up.

7. Why am I not able to write any more files to my home directory?

All machines have user home areas mounted, though these are limited to 10 GB by default, so this area should not be used for storing large quantities of data. However, the area IS backed up, so it is appropriate for important, small files or source code personal to you. You can check your current disk usage and quota using e.g. the command:

pan_quota
				

If you exceed your home disk space quota (10 GB) on JASMIN, you will not be able to write any more files. You should be using a Group Workspace to store files. To get back within quota, you will need to delete files or move them to a GWS.

Please note an important difference with respect to standard computing environments: /tmp is only available to a local compute node. You should be aware that parallel jobs will therefore be accessing different disks at /tmp. Despite what you may be used to, IT WILL BE FASTER AND LESS OF A LOAD ON OTHER USERS to access temporary files in a Group Workspace (which is optimised for fast I/O) rather than /tmp (which is not). In general you should not use /tmp because (i) it is small and (ii) it is only visible to the processing node where your job is running.

/work/scratch is a shared area of high-performance disk but ONLY for use by the LOTUS batch processing cluster for working files, and it should no longer be used by individual users as general storage space. We have now set up 2 “generic” GWSs to cater for users who do not belong to projects which have group workspaces: ncas_generic and nceo_generic. Users are welcome to apply for membership of these workspaces.

For LOTUS users of /work/scratch, please create a /work/scratch/<your.userid> directory (from jasmin-sci1 or project dedicated VMs) and use that for your temporary work space. The same directory is mounted on all machines so you may want to create per-job directories to isolate each job.
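The per-job directory idea can be sketched as follows. This is an illustrative example only: the function name make_job_dir and the "job." prefix are assumptions, and the default base path simply follows the /work/scratch/<your.userid> convention described above.

```shell
# Sketch: create a uniquely named per-job working directory under your
# scratch area, so that concurrent jobs do not overwrite each other's files.
make_job_dir() {
    base=${1:-/work/scratch/$USER}
    mkdir -p "$base" || return 1
    # mktemp -d creates the directory with a unique suffix and prints its path.
    mktemp -d "$base/job.XXXXXX"
}
```

In a job script this could be used as workdir=$(make_job_dir) followed by cd "$workdir", removing the directory once the job has finished with it.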

8. Listing files/directories and checking sizes: ls, du and pan_du

The ls command shows the apparent size of a file, i.e. the number of bytes it contains.

du normally shows the amount of space actually used by a file. However, for the Panasas storage system used by JASMIN/CEMS, the alternative command pan_du must be used (when the filesystem is mounted using panfs, which is normally the case on JASMIN). pan_du reports the actual number of disk blocks in use.

Each layer of abstraction on top of individual bits and bytes results in wasted space when a data file is smaller than the smallest data unit the file system is capable of tracking. This wasted space within a sector, cluster, or block is commonly referred to as slack space, and it cannot normally be used for storage of additional data. So files don't have to neatly fit into blocks. So for example, assuming a block size of 1024, if a file was 1024 bytes, its size in ls and du would be 1024. If the file size was 1025, the size would be 1025 in ls and 2048 in du.
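The arithmetic in the example above can be reproduced with a few lines of shell. This is a simplified model that only rounds the apparent size up to whole blocks; the function name allocated_size is illustrative, and real file systems may add further overheads.

```shell
# Worked example: round a file's apparent size up to a whole number of blocks.
# Usage: allocated_size SIZE_BYTES BLOCK_BYTES
allocated_size() {
    size=$1
    block=$2
    blocks=$(( (size + block - 1) / block ))   # ceiling division
    echo $(( blocks * block ))
}

allocated_size 1024 1024   # prints 1024
allocated_size 1025 1024   # prints 2048
```

The second call shows the slack space: one extra byte costs a whole additional block.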

The slack space on JASMIN can be different to the slack space on your source computer - hence the difference in du size.

9. How much space is left in my Group Workspace?

This information has been migrated to: http://help.ceda.ac.uk/article/203-managing-a-gws 

10. Is it possible to recover a deleted file?

The ONLY areas that have routine backups are /home/users and the system discs of machines that are visible to the outside world (eg jasmin-login1.ceda.ac.uk). There is a daily incremental and weekly full backup in place on /home/users. However be aware that you only have a 10GB quota on /home/users so /home/users is not a place to store data sets. Under normal circumstances, we would aim to restore files on /home/users within 1-2 days.

“.snapshot” directories under /home/users/ : users can use snapshots to recover files if they accidentally delete them (ls .snapshot will list <date>.home_users directories containing files as they were on that date). File(s) can then be copied back from one of these directories to their original location. Note that these directories will not appear in a normal ls listing. It is necessary to explicitly refer to the snapshot directory by name:

[user@hostname ~]$ ls .snapshot
2016.02.04.23.31.01.home_users  2016.02.08.23.31.02.home_users
2016.02.05.23.31.09.home_users  2016.02.09.23.31.02.home_users
2016.02.06.23.31.02.home_users  2016.02.10.23.31.14.home_users
2016.02.07.23.31.03.home_users
				

GWS managers may request snapshots to be enabled on their GWS but they need to be sure they have a backup routine in place via another route if the data is critical.

Group workspaces are NOT BACKED UP

It is the responsibility of the Group Workspace manager to look after the data in the workspace. Tools are provided in the Elastic Tape service for GWS managers to use commands to copy or move all or part of a GWS to near-line tape and back again, either to create a secondary copy of the data on tape, or to move portions of the data to tape to free up (expensive) high-performance disk. If you are a user of a GWS and need to have a secondary copy made of your data, please speak to your GWS manager to ask for this to be done.

11. How do I access the CEDA archive?

The archive is deliberately not mounted onto jasmin-login1 nor cems-login1.

jasmin-login1 is just meant to be a login gateway to either jasmin-sci1 (for working), jasmin-xfer1 (for transferring data in/out) or project-specific VMs. The same applies to CEMS.

The BADC/NEODC archives are available from jasmin/cems-xfer1, and jasmin/cems-sci1. Unix-group permissions act PARTIALLY to restrict your access to those datasets which you have permission to access. You may only access/download datasets for which you explicitly have a current dataset access role / license, for the purpose/project you mentioned in your dataset application. If you do not have a current dataset access role for a dataset you come across in the filesystem, or your intended use does not match that which you mentioned when applying, please first submit an application for access to the dataset via MyCEDA (https://services.ceda.ac.uk/cedasite/myceda) before accessing or downloading the data, otherwise you risk breaching the terms and conditions of that dataset.

BADC data are available at /badc

NEODC data are available at /neodc

12. What data can I access from JASMIN/CEMS?

You may have a login onto jasmin-login1 or cems-login1 but may not have any permissions to access the archive. You have to be granted permissions onto other machines where the archive is mounted, or use services like ftp.ceda.ac.uk which have their own authentication mechanism for archive access.

If you have been granted access to a restricted dataset then you should be able to see the data archive from jasmin/cems-sci1 (or jasmin/cems-xfer1).

You can check which restricted datasets you currently have access to via MyCEDA.

13. How do I transfer my data to JASMIN/CEMS?

jasmin-xfer1.ceda.ac.uk and cems-xfer1.cems.rl.ac.uk are data transfer machines which enable direct data transfer of files in group workspaces (without going via jasmin-login1/cems-login1). Data may be transferred via scp, rsync-over-ssh and bbcp.

All the JASMIN/CEMS Group Workspaces (GWS) are visible (and read/writeable to those users with permission) on the jasmin-xfer1/cems-xfer1 VM.

To transfer data to the JASMIN Group Workspace xxx using rsync:

$ rsync SOME_FILE username@jasmin-xfer1.ceda.ac.uk:/group_workspaces/jasmin/gws_xxx/

14. How can I transfer data in and out of JASMIN over GridFTP?

DRAFT: this method is still being configured

You can run GridFTP using ssh as your control connection to JASMIN. You will need an installation of the Globus Toolkit with GridFTP client support. Version 5.2.4 has been tested on JASMIN.

Depending on your Globus installation you may need to start by setting up Globus. If you don't have globus-url-copy in your path, try something like:

$ export GLOBUS_LOCATION=/usr/local/globus-5.2
				$ source $GLOBUS_LOCATION/etc/globus-user-env.sh
				

The JASMIN machine jasmin-xfer1.ceda.ac.uk is configured to allow GridFTP access over ssh. Test that you have access by listing a directory. You will need your JASMIN ssh key on your keychain for this to work:

$ globus-url-copy -list sshftp://spascoe@jasmin-xfer1.ceda.ac.uk/badc/cmip5/data/
				sshftp://spascoe@jasmin-xfer1.ceda.ac.uk/badc/cmip5/data/
				    .ftpaccess
				    GeoMIP/
				    cmip5/
				    pmip3/
				    tamip/
				

NOTE: You should always use absolute paths with globus-url-copy and, when specifying directories, always end a URL with "/".

Now we can try downloading one variable from CMIP5. This will employ recursive directory download.

$ time globus-url-copy -v -r sshftp://spascoe@jasmin-xfer1.ceda.ac.uk/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/land/Lmon/r1i1p1/latest/landCoverFrac/ file://$PWD/
				Source: sshftp://spascoe@jasmin-xfer1.ceda.ac.uk/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/land/Lmon/r1i1p1/latest/landCoverFrac/
				Dest:   file:///home/spascoe/expt/gridftp/test_xfer/
				  landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_200512-203011.nc
				Source: sshftp://spascoe@jasmin-xfer1.ceda.ac.uk/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/land/Lmon/r1i1p1/latest/landCoverFrac/
				Dest:   file:///home/spascoe/expt/gridftp/test_xfer/
				  landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_203012-205511.nc

				...

				Source: sshftp://spascoe@jasmin-xfer1.ceda.ac.uk/badc/cmip5/data/cmip5/output1/MOHC/HadGEM2-ES/rcp85/mon/land/Lmon/r1i1p1/latest/landCoverFrac/
				Dest:   file:///home/spascoe/expt/gridftp/test_xfer/
				  landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_229912-229912.nc

				real	1m8.771s
				user	0m1.064s
				sys	0m10.321s

				$ ls -l ; du -hs .
				total 3455372
				-rw-rw-r-- 1 spascoe spascoe 300696884 Jun 11 12:13 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_200512-203011.nc
				-rw-rw-r-- 1 spascoe spascoe 300696900 Jun 11 12:13 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_203012-205511.nc
				-rw-rw-r-- 1 spascoe spascoe 300696916 Jun 11 12:13 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_205512-208011.nc
				-rw-rw-r-- 1 spascoe spascoe 229536172 Jun 11 12:13 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_208012-209912.nc
				-rw-rw-r-- 1 spascoe spascoe 300696520 Jun 11 12:13 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_209912-212411.nc
				-rw-rw-r-- 1 spascoe spascoe 300696520 Jun 11 12:14 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_212412-214911.nc
				-rw-rw-r-- 1 spascoe spascoe 300696520 Jun 11 12:14 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_214912-217411.nc
				-rw-rw-r-- 1 spascoe spascoe 300696520 Jun 11 12:14 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_217412-219911.nc
				-rw-rw-r-- 1 spascoe spascoe 300696520 Jun 11 12:14 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_219912-222411.nc
				-rw-rw-r-- 1 spascoe spascoe 300696520 Jun 11 12:14 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_222412-224911.nc
				-rw-rw-r-- 1 spascoe spascoe 300696520 Jun 11 12:14 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_224912-227411.nc
				-rw-rw-r-- 1 spascoe spascoe 300696520 Jun 11 12:14 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_227412-229911.nc
				-rw-rw-r-- 1 spascoe spascoe   1019584 Jun 11 12:14 landCoverFrac_Lmon_HadGEM2-ES_rcp85_r1i1p1_229912-229912.nc
				3.3G	.

				

I.e. 3.3 GB was transferred across the RAL network in 69 s, which is roughly 48 MB/s (about 0.38 Gbit/s). This was to a desktop machine at RAL so we don't expect too much.

For further information on globus-url-copy see the Globus User Guide.

15. Rsyncing data to/from JASMIN/CEMS transfer servers

Copying Data using Transfer Host(s)

You should use one of the following transfer hosts to copy files/directories to/from Group Workspaces on JASMIN/CEMS:

  • jasmin-xfer1.ceda.ac.uk
  • cems-xfer1.ceda.ac.uk

You should be able to rsync data to a Group Workspace following these instructions from your local Unix terminal:

  $ bash

				  # Start an ssh-agent session and add your private key
				  $ exec ssh-agent $SHELL
				  $ ssh-add ~/.ssh/<YOUR_SSH_PRIVATE_KEY>

				  # Rsync to the "xfer1" VM
				  $ rsync SOME_FILE <USERNAME>@jasmin-xfer1.ceda.ac.uk:/group_workspaces/jasmin/<YOUR_GROUP_WORKSPACE>/
				

16. How do I set up a cron job?

There is no cron facility available to users on the shared JASMIN/CEMS science analysis machines or transfer machines (e.g. jasmin-sci1, jasmin-xfer1), but cron jobs can be set up on project dedicated VMs. Please also consult the LOTUS documentation for information about job scheduling, which may be helpful in designing your workflow.

17. I see file names like .panfs.cc7410a.1113355902480856000. Why can't I delete them?

Our current information on this is as follows:

Like NFS, PanFS (and, thus, DirectFLOW™ clients) uses a feature known as silly renaming to preserve free-on-last-close semantics. Before discussing how DirectFLOW implements silly renaming, it might help if you understand silly renaming in the context of NFS, which is where most people initially encounter the phenomenon.

Suppose there are two processes on the same NFS client. The first process has opened a file on the NFS server and the second process wants to remove it. This causes the NFS client to rename the file temporarily to something like .nfsf797a0b20000000b until the first process is done with it, at which time the file is really removed. This preserves the freeing-on-last-close semantics. Silly renaming can happen even if only one NFS client is involved. The key idea is to unlink a file while it is still open.

Even though PanFS is not NFS, it also has this feature (no unpleasant surprise to customers). In PanFS, silly renaming is handled entirely by a DirectFLOW client, just as an NFS client does. If a DirectFLOW client wants to delete a still-opened file, the corresponding client will explicitly ask the relevant FM to rename the file to some obscure name. When the file is eventually closed, the client will send another RPC to the FM to delete the file.

A PanFS silly name begins with a dot and consists of three fields separated by two additional dots: a fixed string, panfs, the hexadecimal IP address of the client, and a timestamp (second and nanosecond). The whole idea of silly renaming is to keep the file in the name space temporarily until all existing processes are done with it. But the name is so obscure that it is unlikely that any new user will find it. In PanFS, we need silly renaming so that the client can use the name to delete the silly-renamed file when its last reference is gone.

To avoid further complicating the logic, a silly renamed file cannot be involved in any rename, delete, or link operation, either as the source or the target on that machine. On a separate DirectFLOW client, the file is treated as a regular file and can be renamed, deleted, or linked as you would any other file.

How Are the Names Created?

The names DirectFLOW uses for silly renamed files adhere to the following convention:

.panfs.<hexadecimal version of the IP address of the creator>.<creation timespec>
				

Notice that there are three dots in a silly name. For example, if you have a file named .panfs.cc7410a.1113355902480856000, this means that a DF client with the IP address 10.65.199.12 created the silly named file at 1113355902:480856000 (Tue Apr 12 21:31:42 2005).

NOTE: panfs_trace -S translates the silly-renamed file into a properly formed IP address and timestamp.
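Where panfs_trace is not to hand, the fields of a silly name can be split with plain shell arithmetic. The sketch below is not a Panasas tool. The function name decode_panfs_name is hypothetical, and the byte order of the address (least significant byte first) is an assumption inferred from the single example above, where cc7410a corresponds to 10.65.199.12.

```shell
# Sketch: split a PanFS silly name into client IP address, seconds and nanoseconds.
# Assumption: the hexadecimal field stores the IPv4 address with its least
# significant byte first, as suggested by the example in the text.
decode_panfs_name() {
    rest=${1#.panfs.}          # drop the fixed ".panfs." prefix
    hex=${rest%%.*}            # hexadecimal IP address field
    stamp=${rest#*.}           # seconds followed by 9 digits of nanoseconds
    n=$(( 0x$hex ))
    ip="$(( n & 255 )).$(( (n >> 8) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 24) & 255 ))"
    secs=${stamp%?????????}    # everything except the last 9 digits
    nanos=${stamp#"$secs"}     # the last 9 digits
    echo "$ip $secs $nanos"
}

decode_panfs_name .panfs.cc7410a.1113355902480856000
# prints: 10.65.199.12 1113355902 480856000
```

With GNU date, the seconds value can then be turned into a readable time using date -d @1113355902; the result is shown in the time zone of the machine where the command is run.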

18. How do I access the web from a shared analysis VM?

The shared scientific analysis machines (jasmin-sci[123].ceda.ac.uk, cems-sci[12].cems.rl.ac.uk) and other virtual machines within JASMIN are now configured to make outgoing HTTP requests directly, without the need for the RAL web proxy, which has now been decommissioned. Please ensure that both software configuration options and user environment variables do NOT set the http_proxy or https_proxy environment variables from now on (20/05/2016).

19. How do I clone a github repository from a shared analysis VM?

The shared analysis VMs do not allow outgoing ssh access to machines outside the RAL firewall therefore you cannot use the ssh access method to clone repositories, however you can use https:

$ git clone https://github.com/<repo-path>
				

20. How do I install Python packages myself?

If you want to add extra packages to your JASMIN environment you can use virtualenv to create a customised python environment into which you can install software using pip or easy_install.

  1. Check your web access settings (see question 18: the http_proxy and https_proxy environment variables should not be set)
  2. Run virtualenv configured to inherit site packages. You choose a directory in which to create your virtual environment.
    $ virtualenv --system-site-packages $PATH_TO_VENV_DIRECTORY
    				
  3. Activate the virtualenv to make it your default python. This must be done each time you login.
    $ source $PATH_TO_VENV_DIRECTORY/bin/activate
    				
  4. Install packages
    $ pip install $MYPACKAGE
    				

21. What text editors are available?

  • leafpad -- A lightweight editor with "Notepad"-style hotkeys.
  • geany -- A more sophisticated IDE. It can also be used as a simple editor, although has extra visual clutter unrelated to simple text editing.
  • nedit -- Another lightweight editor with Notepad-style hotkeys. Be aware that this seems to generate font warnings on some X displays. The fonts still appear to be usable so the warnings are cosmetic, and we do not propose to solve this problem given the availability of leafpad, but we provide nedit in case it is useful.
  • emacs, xemacs, vim-enhanced -- Editors with a long history of use on Unix / Linux systems. You will probably already know if you are interested in using one of these. Be aware that most of the key bindings differ from the Notepad ones.

22. How to acknowledge the JASMIN team

Please cite the IEEE JASMIN paper (doi:10.1109/BigData.2013.6691556) and also make sure that the words JASMIN and CEDA appear in the acknowledgements.

23. Why can't I find common Python modules on the *sci machines?

The sci machines jasmin-sci[12].ceda.ac.uk and cems-sci1.cems.rl.ac.uk have a standard software stack installed, called the JASMIN Analysis Platform. This uses Python 2.7 for all the scientific software packages. However, it runs on Red Hat Enterprise Linux 6, which uses Python 2.6 as the default version of Python required by the operating system.

As a user, this means that to use the scientific software you should explicitly invoke python using the command python2.7. Just typing python would give you the 2.6 version, and most of the scientific software modules would be unavailable.

Also, any executable Python scripts should start with:

#!/usr/bin/python2.7
				

or, to allow for possible use of the virtualenv package, with:

#!/usr/bin/env python2.7
				

Unfortunately it is not possible to override the default on a system-wide basis because scripts used by the operating system are tested with Python 2.6.

(Advanced users may override the default in their own setup by setting a shell alias. An alternative is to create a symlink called "python" pointing to /usr/bin/python2.7, and add the containing directory to $PATH; this approach will also work for any executable scripts that start with #!/usr/bin/env python. However, care would be needed to ensure that any such setup is loaded when required in environments such as the Lotus batch queues.)
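
A minimal sketch of the symlink approach described above. The directory name python-default is hypothetical; any directory you own will do:

```shell
# Hypothetical directory to hold the personal "python" symlink.
PYBIN_DIR="$HOME/python-default"
mkdir -p "$PYBIN_DIR"

# Create (or replace) a symlink called "python" pointing at the 2.7 interpreter.
ln -sfn /usr/bin/python2.7 "$PYBIN_DIR/python"

# Put the directory at the front of PATH so that this "python" is found first.
export PATH="$PYBIN_DIR:$PATH"
```

The PATH change only lasts for the current session. To make it persistent you would add the export line to a shell start-up file, keeping in mind the caveat above about batch environments.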

24. User X is taking all the resources on jasmin-sciN: please tell them to desist!

We need all users to act responsibly in their use of the shared-access scientific analysis machines ("sci VMs") jasmin-sci[123], cems-sci[23]. Within JASMIN, two modes of shared compute resource are available for general use: the sci VMs, and the LOTUS batch processing cluster.

The benefit of the interactive environment provided by the sci machines is that users can test code, make improvements and run programs interactively. The downside is that, with many users and without close per-user monitoring and limits, resources will at times be stretched and response times will be slow. Policing this to the level where performance is guaranteed would impose more restrictions than users would tolerate and more management overhead than we are able to offer.

This is the trade-off you accept by using the interactive environment. Note, however, that JASMIN was not primarily designed for heavy interactive (particularly graphical) computing.

If you have processing that requires larger resource usage or is more efficient to run in batch mode, you are STRONGLY encouraged to use the LOTUS cluster, ideally after testing your processing on a smaller scale interactively. LOTUS provides a fair-share allocation system to ensure that users get a fair "slice" of available resources. The same stack of standard software is available, as is access to group workspaces and home directories. Jobs can be configured to make use of high-memory nodes or parallel processing capabilities. But if your job is a very large, multi-way MPI job with a huge memory requirement, don't be surprised if it waits a long time in the queue for these resources to become available. Remember, JASMIN is a shared data analysis environment (a different thing from both a data archive and a supercomputer). If your compute needs exceed what can be provided (in a fair way) on JASMIN, please consider other high-performance computing resources available to NERC users, for example the ARCHER supercomputer.

In short, it's very difficult to provide free and easy access to shared interactive compute in a way that guarantees performance at any given time. We provide users with both interactive and batch computing, which we hope meets most needs as well as possible. With the advent of the JASMIN cloud, it is technically feasible for you to create your own bespoke computing environment within an overall envelope of resource available to your project. This may be another option if the generic shared resources do not meet your needs.
