Cluster usage
Problem: ssh cannot log in to the cluster and reports the error WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!
Answer: 1> The host key of the login node has changed and no longer matches the key your SSH client saved earlier, so the client blocks the login. The fix is to edit the ~/.ssh/known_hosts file, delete the line that corresponds to the login node, save and exit, then reconnect and choose to trust the new host key.
2> The warning can also indicate a man-in-the-middle attack, since ssh shows it whenever the server's key differs from the one stored earlier. This is relatively rare.
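The stale entry can also be removed without opening an editor. The following is a minimal sketch, assuming the OpenSSH client tools are installed; forget_host is a hypothetical helper name and login.example.org is a placeholder, not the cluster's real login address.

```shell
# forget_host HOST [FILE]: delete every saved key for HOST from a known_hosts file.
# ssh-keygen -R also matches hashed host names and keeps a backup in FILE.old.
forget_host() {
    local host="$1"
    local file="${2:-$HOME/.ssh/known_hosts}"
    ssh-keygen -R "$host" -f "$file"
}

# Usage (placeholder host name, replace it with the real login node):
#   forget_host login.example.org
#   ssh your_user@login.example.org   # check the new fingerprint, then answer "yes"
```

If there is any doubt about why the key changed, confirming the new fingerprint with the computing center before answering "yes" covers the rarer man-in-the-middle case.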
Problem: Running a graphical program reports an error
Answer: The cluster supports X11 forwarding, so graphical programs can open their windows on your local display.
1> Check whether the terminal you use supports X11 forwarding by running xclock directly on the login node and seeing whether the clock window appears. If it does not, your terminal most likely does not support X11 forwarding, or the required plug-in is not installed.
2> If xclock works on the login node, submit a test to the cluster: srun -p q_cn --x11 --pty xclock. If that fails, consider the following two causes:
Check whether your home space is full. If it is, move large directories to DATA and make a soft link to them from home, so that they no longer occupy space under home.
Otherwise, regenerate the local key: 1> ssh-keygen 2> cat id_dsa.pub > authorized_keys 3> chmod 600 authorized_keys (run the last two commands in ~/.ssh; if ssh-keygen created a key of another type, use that .pub file in place of id_dsa.pub)
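The two tests above can be preceded by a quick check that the shell has a display at all. This is a minimal sketch, assuming an OpenSSH-style login where forwarding is requested with ssh -X; check_display is a hypothetical helper, not a cluster command.

```shell
# check_display: succeed only if an X11 display is visible to this shell.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set: reconnect with 'ssh -X' or enable X11 forwarding in your terminal"
        return 1
    fi
    echo "DISPLAY is set to $DISPLAY"
}

# Usage, in the order described above (q_cn is the partition named in the answer):
#   check_display && xclock                             # 1> test on the login node
#   check_display && srun -p q_cn --x11 --pty xclock    # 2> test on a compute node
```

An empty DISPLAY on the login node points at the terminal or its X server rather than at the cluster.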
Problem: Unable to write files, error Disk quota exceeded
Answer: 1> Check the lab storage spaces DATA and scratch60
View group usage: mmlsquota -g `groups` (default quota: DATA 2 TB, scratch60 10 TB)
View the usage of the DATA directory: mmlsquota -j `groups`_permanent gpfs
View the usage of the scratch60 directory: mmlsquota -j `groups`_temp gpfs
If storage space is insufficient, clean up unused data or apply for more storage space
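The three queries above substitute the output of groups directly, which only works when you belong to a single group. A minimal sketch that takes the group name explicitly is shown here; quota_commands is a hypothetical helper, the default of id -gn assumes the lab group is your primary group, and the fileset suffixes and the gpfs device name are copied from the commands above.

```shell
# quota_commands [GROUP]: print the three quota queries for one lab group.
# Assumption: without an argument, the primary group (id -gn) is the lab group.
quota_commands() {
    local group="${1:-$(id -gn)}"
    printf 'mmlsquota -g %s\n' "$group"
    printf 'mmlsquota -j %s_permanent gpfs\n' "$group"
    printf 'mmlsquota -j %s_temp gpfs\n' "$group"
}

# Usage on a login node:
#   quota_commands          # review the commands first
#   quota_commands | sh     # then run them
```

Printing the commands before running them makes it easy to spot a wrong group name.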
2> Check the space used under home
cd $HOME && du -sh && du -sh .[!.]*
A relatively large directory can be moved to DATA and replaced with a soft link, so that it no longer takes up home space. In this example DATA stands for the user's DATA directory, /GPFS/zhangli_lab_permanent/wangyanmin:
mv $HOME/.local DATA/
ln -sf /GPFS/zhangli_lab_permanent/wangyanmin/.local /home/zhangli_lab/wangyanmin/
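The move and the soft link can be combined into one guarded step. This is a minimal sketch under the assumption that the target directory on DATA already exists; relocate_to_data is a hypothetical helper name.

```shell
# relocate_to_data NAME DATA_DIR: move $HOME/NAME into DATA_DIR and
# leave a soft link behind so programs still find it under home.
relocate_to_data() {
    local name="$1" data_dir="$2"
    local src="$HOME/$name" dst="$data_dir/$name"
    if [ -L "$src" ]; then
        echo "$src is already a soft link" >&2
        return 0
    fi
    if [ ! -d "$src" ]; then
        echo "$src is not a directory" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    mv "$src" "$dst" && ln -s "$dst" "$src"
}

# Usage with the paths from the example above:
#   relocate_to_data .local /GPFS/zhangli_lab_permanent/wangyanmin
```

The checks stop the helper from nesting a directory inside an existing copy or overwriting data that is already on DATA.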
Problem: Submitted tasks stop automatically
Answer: 1> The program reported an error or ran out of memory.
2> The time limit was exceeded. The default time limit for tasks submitted to the cluster is 2 days, and an email reminder is sent about 3 hours before the task ends. If a task needs more time, contact the administrator of the computing center to extend it.
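To tell these causes apart, the scheduler's accounting record for the job is usually enough. The sketch below assumes the cluster runs Slurm (as the srun command elsewhere in this FAQ suggests) with job accounting enabled; explain_state is a hypothetical helper and 123456 is a placeholder job ID.

```shell
# explain_state STATE: map a Slurm job state to the causes listed above.
explain_state() {
    case "$1" in
        TIMEOUT)        echo "time limit exceeded" ;;
        OUT_OF_MEMORY)  echo "memory limit exceeded" ;;
        FAILED)         echo "program exited with an error" ;;
        CANCELLED*)     echo "cancelled by a user or administrator" ;;
        COMPLETED)      echo "finished normally" ;;
        *)              echo "other: see the sacct documentation" ;;
    esac
}

# Usage (123456 is a placeholder job ID):
#   sacct -j 123456 --format=JobID,State,ExitCode,Elapsed,Timelimit,MaxRSS
#   explain_state "$(sacct -j 123456 -X -n -P --format=State)"
```

Comparing Elapsed with Timelimit in the first command shows how close the job came to the 2-day default.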
Problem: Installing software requires administrator privileges
Solution: Root or sudo privileges are not granted to users under any circumstances.
If the instructions ask you to run something like sudo yum install: most of the system's dependency packages are already installed, so you can usually skip this step.
If the instructions ask you to run sudo apt install: the computing center cluster uses CentOS rather than Debian/Ubuntu, and the package you want may have a different name under yum on CentOS. Again, because most dependencies are already installed, you can simply skip this step.
If later installation or usage steps show that a dependency package really is missing, tell the computing center system administrator the name of the package and ask the administrator to install it.
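Before contacting the administrator, you can check without root whether a dependency is already present. This is a minimal sketch: both helper names are hypothetical, the -dev to -devel renaming is only a common convention and not a guaranteed mapping, and libxml2-dev is just an example package.

```shell
# centos_devel_name PKG: guess the CentOS name of a Debian/Ubuntu package.
# Heuristic only: Debian "-dev" packages are usually called "-devel" on CentOS.
centos_devel_name() {
    case "$1" in
        *-dev) echo "${1%-dev}-devel" ;;
        *)     echo "$1" ;;
    esac
}

# is_installed PKG: query the RPM database; this needs no root privileges.
is_installed() {
    rpm -q "$1" >/dev/null 2>&1
}

# Usage (libxml2-dev is only an example package name):
#   pkg=$(centos_devel_name libxml2-dev)
#   is_installed "$pkg" && echo "$pkg is already installed" || echo "ask the administrator for $pkg"
```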
Another reason users ask for root (sudo) permissions is that the default installation path (usually /opt or /usr/local, which require root permissions) was not changed when installing the software. If you compile the software from source, you can usually pass the --prefix= option at the configure step to set the installation directory to a directory of your own. If the software comes with an installer, change the default installation directory to your own directory in the installation wizard.
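For a source build, the --prefix approach might look like the following sketch. It assumes the source tree ships a standard ./configure script; the helper names and both paths in the usage lines are hypothetical.

```shell
# build_into_prefix SRC_DIR PREFIX: configure, build, and install without root.
# Assumes SRC_DIR contains a standard ./configure script.
build_into_prefix() {
    local src_dir="$1" prefix="$2"
    ( cd "$src_dir" && ./configure --prefix="$prefix" && make && make install )
}

# use_prefix PREFIX: make the installed programs and libraries visible to this shell.
use_prefix() {
    export PATH="$1/bin:$PATH"
    export LD_LIBRARY_PATH="$1/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Usage (both paths are hypothetical):
#   build_into_prefix "$HOME/src/mytool-1.0" "$HOME/software/mytool"
#   use_prefix "$HOME/software/mytool"
```

Since home space is limited, a prefix under DATA may be a better choice than one under home.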