
Compute container FAQ (frequently asked questions)

Jupyter reports "Kernel Restarting" during execution. What's going on?

When the kernel in a Jupyter Notebook crashes, stops, or encounters an error, Jupyter Notebook automatically attempts to restart it. During this process, all variables and data in the notebook are cleared and the code must be rerun. Possible causes include code errors, memory problems, resource limits, or other system issues. We suggest checking the code for syntax and logic errors and making sure memory usage does not exceed the available capacity. If the problem persists, try a newer Python version or upgrade the related dependency libraries.
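
If you suspect the kernel was killed for running out of memory, a quick check from a terminal in the container (a minimal sketch, assuming `dmesg` is readable there) is to look for OOM-killer messages in the kernel log and compare current memory usage against what is available:

```bash
# Look for OOM-killer messages in the kernel log
dmesg | grep -iE "out of memory|killed process"

# Compare current memory usage against available capacity
free -h
```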

How do I find the path of an executable program?

You can search for it with the `which` command:

```bash
which darknet
```

Example output:

```
/usr/local/bin/darknet
```

If the file is in a non-standard path, you can search for it with the `find` command:

```bash
find / -name program
```

For more detailed usage instructions, refer to `man which` and `man find`.
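
For instance, a common pattern (not specific to this platform) is to restrict `find` to regular files and discard permission errors so the match is easier to spot:

```bash
# Search the whole filesystem for a file named darknet, hiding "Permission denied" noise
find / -type f -name "darknet" 2>/dev/null
```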

Why can't I exit Vim editing mode in the workspace Terminal?

Based on user feedback: if you use Vim to edit files in the workspace Terminal and run into unusual behavior such as being unable to exit editing mode (typically, after entering insert mode with i, pressing esc and then :wq does not leave the current mode), check whether a Vim-related plugin/extension such as Vimium or cVim is installed and enabled in your Google Chrome browser. Some settings of such plugins are very likely to conflict with the shortcut keys of the workspace Terminal.

Suggestion: disable the related plugins/extensions, or switch to another browser.

Why is the original content missing after starting a container with "Continue execution"?

[Why do we need execution records](/docs/gear/records/#Why do we need execution records) introduces that each container execution runs in an independent environment, and [Continue execution from the specified execution record](/docs/gear/records/#Continue execution from the specified execution record) explains what "Continue execution" actually does: it binds the content of the previous execution to the new execution's /hyperai/home directory. Here "the previous execution" means the execution on which the user clicked "Continue execution". If that execution produced no output, the new execution will naturally contain no corresponding data. This typically happens when the execution from which "Continue execution" was clicked was itself closed before it ever started, while an earlier execution does hold the actual data.

For example: I created a new container cifar10 and created an execution, "First execution", opened as a Jupyter workspace. I wrote an .ipynb file and then closed it. An hour later, I clicked "Continue execution" on "First execution" to open the container again. While this execution was still in the "Preparing" state, I noticed I had selected the wrong compute type and immediately closed it. At this point my container has a "Second execution". I then clicked "Continue execution" on the "Second execution" page, and when the container opened I found that my /hyperai/home directory was empty. That is because the second execution was closed abnormally and never had a chance to synchronize the content of the first execution. The correct approach is therefore to click "Continue execution" on "First execution" to open the container.

Why do I lose the content I installed when I reopen the container?

[Why do we need execution records](/docs/gear/records/#Why do we need execution records) introduces that each container execution runs in an independent environment, so manually installed dependency packages will disappear. If you want the dependencies to be available every time you open a container, see [How to add dependencies that are not in the list](/docs/runtimes/#How to add dependencies that are not in the list). In addition, we also update the images periodically to add more common dependencies.
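
As a stopgap until you set up a proper dependency list, one common workaround (a minimal sketch, assuming `pip` is available in the image; `requests` is only a placeholder package name) is to install packages into the persistent working directory so they survive into the next execution:

```bash
# Install into a directory under the persistent working directory
pip install --target=/hyperai/home/deps requests

# Make the installed packages importable in the current session
export PYTHONPATH=/hyperai/home/deps:$PYTHONPATH
```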

Why has my container been stuck in the "Preparing" state?

The possible reasons for a container staying in the "Preparing" state for a long time during startup include the following:

  1. A large number of files are bound to /hyperai/home. During the startup phase the container copies that data into /hyperai/home, which takes time; copying a large number of small files is especially slow. While this is happening, the specific execution page will show "Synchronizing data" together with the current synchronization speed. If you do not need to write to the data, we suggest creating it as a dataset and binding it to /input0-4 to avoid the copy; see [Gear working directory - Create a working directory as a data warehouse version](/docs/gear/output#Create a working directory as a data warehouse version) for how to create a new dataset version from a container's "working directory". (A quick way to check the file count is shown after this list.)
  2. The container is in a cold-start state: the machine assigned to start the container does not yet have the required image and has to pull it. Although we have added mirror nodes inside the cluster to shorten image retrieval, a cold start can still spend 3 - 5 minutes pulling the image. In this case the specific execution page will show the execution in the "Pulling image" state.
  3. Network issues: internal network jitter caused problems during container creation. If the two possibilities above are ruled out (you have not bound a large amount of data to /hyperai/home, yet the container still has not started after 5 - 10 minutes), try restarting the container or contact customer service via the chat window in the interface.
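
To gauge whether data synchronization is the bottleneck, you can count the files bound to the working directory from a terminal; copying tens of thousands of small files is far slower than copying one large file of the same total size. A minimal check:

```bash
# Count the files under the working directory; very large counts explain slow syncs
find /hyperai/home -type f | wc -l
```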

Why has my container been stuck in the "Synchronizing data and closing" state?

This usually has the same root cause as slow container startup: a large number of files are bound to /hyperai/home, so the container has to synchronize a large amount of data when it closes. Synchronizing a large amount of data, especially a large number of small files, is very time-consuming. If you do not need to write to the data, we suggest creating it as a dataset and binding it to /input0-4 to avoid the copy.

See [Gear working directory - Create a working directory as a data warehouse version](/docs/gear/output#Create a working directory as a data warehouse version) for how to create a new dataset version from a container's "working directory".
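
Before closing the container you can estimate how much data will have to be synchronized (a minimal check, assuming terminal access):

```bash
# Total size of the working directory that will be synchronized on close
du -sh /hyperai/home
```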

If this still happens even though there are not many files in the container, please contact customer service via the chat window in the interface.

How do I upload, view, and preview .ipynb files?

You can upload them in two ways:

In compute container mode, you can create a new compute container and select "Workspace" as the access method. After the container finishes starting, open the workspace and upload the .ipynb file there.

For a pretrained model, you can package the .ipynb file into a zip file and upload it directly. After uploading, the notebook can be accessed in the pretrained model's file list. Later you can also create a new container, bind the pretrained model to it, and interact with the notebook in the workspace.
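
For the second route, packaging the notebook is a single command with the standard `zip` tool (`my_notebook.ipynb` is a placeholder file name):

```bash
# Bundle the notebook into a zip archive ready for upload
zip notebooks.zip my_notebook.ipynb
```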

Why is the storage space inside a container different from the space shown in "Resource utilization status", and how are the two related?

The storage space inside a container is the size of the "working directory" (the /hyperai/home directory) workspace opened with the container. For a given compute resource type this space is fixed; for example, the workspace of a "T4" compute resource is 50GB. What "Resource utilization status" reflects, by contrast, is the user's global storage space. The container's storage space is used while the container is open; after the container is closed, the data in its "working directory" (/hyperai/home) is synchronized to the user's global storage space and counts against the corresponding quota. Datasets uploaded by the user are likewise counted directly against the global storage quota.

When the user's global storage space exceeds its quota, the user can no longer create containers or upload datasets. Space can be freed by deleting executions, deleting datasets, and similar operations.
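
To see how much of the container's own workspace is in use while the container is open, a minimal check from a terminal (assuming the working directory is its own mount) is:

```bash
# Disk usage of the filesystem backing the working directory
df -h /hyperai/home
```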

Why is my container frozen, to the point that even the Jupyter workspace page cannot be opened?

A container's compute resources (CPU, GPU, memory) are all fixed. If a program fully occupies the CPU, other processes are severely affected. For example, some programs grab all the CPU resources at once, which makes the Jupyter workspace's own process extremely laggy and can even prevent the Jupyter workspace page from opening. So if you find that your Jupyter workspace page cannot be opened, first check whether the container's CPU is running at full capacity (key metrics such as CPU and memory are displayed on the container page).

If the container is so laggy that you cannot even open the Jupyter workspace page, you can try logging into the container via SSH. If you still cannot log in, try restarting the container.
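
Once logged in over SSH, you can take a one-shot snapshot of CPU usage without an interactive UI — a minimal check with the standard `top` tool:

```bash
# Non-interactive snapshot: overall CPU stats plus the busiest processes
top -bn1 | head -n 15
```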

Why wasn't my "workspace" closed automatically?

Automatic shutdown of the "workspace" requires two conditions to be met:

  1. The Jupyter page has been closed
  2. CPU usage remains close to 0

Note: if the Jupyter page is not closed in the browser, the system considers the "workspace" still in use, and the idle shutdown will not be triggered.
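
To verify the second condition from a terminal, the load averages are a quick proxy (a minimal check; values near 0 indicate the CPU is essentially idle):

```bash
# 1-, 5- and 15-minute load averages near 0 mean CPU usage is close to 0
uptime
```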