Welcome to Dis.co’s documentation!
Job
Example:
    import pathlib
    import disco

    my_file_id = disco.upload_file('my_file.py', pathlib.Path('/home/bob/my_file.py'))
    job = disco.Job.create(my_file_id)
    job.start()
class disco.Job(job_id)
A job that runs on DISCO machines.
Every Job object has its own Python script and data files, and runs independently on the cloud, until it produces a result.
archive()
Archive the job, making it unusable.
classmethod create(script_file_id=None, input_file_ids: Union[str, list] = None, constants_file_ids: Union[str, list] = None, job_name=None, cluster_instance_type='s', cluster_id=None, instance_cost=None, script_repo_id=None, script_file_path_in_repo=None, auto_start=False, upload_requirements_file=True, docker_image_id=None)
Creates a new job.
Args:
- script_file_id (str): The ID of the script file to run.
- input_file_ids: A list of IDs of files that will be used as standard input.
- constants_file_ids: A list of IDs of constants files.
- job_name: A name you can give to your job. Leave empty to use a random string.
- cluster_instance_type: The size of the instance used. Choose ‘m’ for a medium instance and ‘l’ for a large instance. Use gpu_s, gpu_m, or gpu_l for GPU jobs (read more about GPU jobs in disco job create -h). The default is ‘s’ for small.
- cluster_id: The ID of the cluster on which to run the job. Leave as None to run on DISCO’s cluster.
- instance_cost: The instance cost type, either guaranteed or lowCost. The default is None.
- script_repo_id (str): The ID of the Git repository that contains the script file to run (alternative to script_file_id).
- script_file_path_in_repo (str): The path in the Git repository to the script file.
- auto_start: Automatically start the job upon creation.
- upload_requirements_file (bool): If True, uploads a requirements file when running in a venv.
- docker_image_id (str): The ID of the Docker image to run the job in.
Returns:
obj: The created job object.
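The documented values for cluster_instance_type and instance_cost can be checked locally before a job is created. The following is a minimal sketch, not part of the DISCO SDK: build_create_kwargs is a hypothetical helper, and wrapping a single input file ID in a list is an assumption about what is convenient for the caller, not documented library behavior.

```python
from typing import Any, Dict, List, Optional, Union

# Values taken from the argument descriptions of Job.create.
INSTANCE_TYPES = {"s", "m", "l", "gpu_s", "gpu_m", "gpu_l"}
INSTANCE_COSTS = {None, "guaranteed", "lowCost"}


def build_create_kwargs(
    script_file_id: Optional[str] = None,
    input_file_ids: Union[str, List[str], None] = None,
    cluster_instance_type: str = "s",
    instance_cost: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate arguments and return them as kwargs."""
    if cluster_instance_type not in INSTANCE_TYPES:
        raise ValueError(f"unknown cluster_instance_type: {cluster_instance_type!r}")
    if instance_cost not in INSTANCE_COSTS:
        raise ValueError(f"unknown instance_cost: {instance_cost!r}")
    # Assumption: normalize a single ID to a one-element list for consistency.
    if isinstance(input_file_ids, str):
        input_file_ids = [input_file_ids]
    return {
        "script_file_id": script_file_id,
        "input_file_ids": input_file_ids,
        "cluster_instance_type": cluster_instance_type,
        "instance_cost": instance_cost,
    }
```

The returned dictionary could then be passed on as disco.Job.create(**kwargs); a typo such as 'xl' fails locally with a ValueError instead of reaching the server.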
classmethod generate_requirements()
Generates a string of requirements for the requirements file.
Returns:
separated list of requirements
get_details()
Get details about the job.
This includes its name, last activity, status and task states.
Returns:
JobDetails
get_results(block=False, block_timeout=600)
Get the job’s result.
Args:
- block (bool): Pass block=True to first wait for the job to be completed.
- block_timeout (int): Timeout in seconds.
Returns:
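As an illustration of the block argument, the sketch below starts a job and then fetches its results in one blocking call. It uses only the documented start() and get_results() methods. FakeJob is a hypothetical stand-in so the example runs without a DISCO account, and the value it returns is a placeholder, since the return type of get_results is not documented here.

```python
def run_and_collect(job, timeout=600):
    """Start a job, wait for it to complete, and return its results."""
    job.start()
    # block=True makes get_results wait for completion first.
    return job.get_results(block=True, block_timeout=timeout)


class FakeJob:
    """Hypothetical stand-in for disco.Job, used only to run this sketch offline."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")
        return self

    def get_results(self, block=False, block_timeout=600):
        self.calls.append(("get_results", block, block_timeout))
        return ["placeholder-result"]  # placeholder, not a real result shape


fake = FakeJob()
results = run_and_collect(fake, timeout=30)
```

With a real job, the same function would be called as run_and_collect(disco.Job.create(my_file_id)).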
get_status()
Get the job’s status.
Returns (JobStatus):
The status of the job.
classmethod jobs_summary()
Gets a summary of all job statuses.
Returns:
dict: Dictionary [str, int] of status->count
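To make the status->count shape concrete, here is a small sketch that builds the same kind of dictionary from a plain list of status strings. summarize_statuses is a hypothetical helper, not the library's implementation, and using "Queued" and "Running" as keys is an assumption based on the status names mentioned under wait_for_finish.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_statuses(statuses: Iterable[str]) -> Dict[str, int]:
    """Count how many jobs are in each status."""
    return dict(Counter(statuses))


summary = summarize_statuses(["Queued", "Running", "Running"])

# A missing status simply means a count of zero, so use .get with a default.
active = summary.get("Queued", 0) + summary.get("Running", 0)
```

Reading counts with .get avoids a KeyError when no job is currently in a given status.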
classmethod list_jobs(limit=None, next_=None)
Show a list of all the jobs belonging to this user.
Args:
- limit (int):
- next_:
Returns:
list(JobDetails)
start()
Start the job.
When you run job.start(), the DISCO server will queue the job for execution.
Returns:
obj: The job object.
stop()
Cancels a running job.
When you run job.stop(), the DISCO server will stop running the job and return any results retrieved so far.
classmethod upload_requirements_file(cluster_id)
Uploads a requirements file.
Args:
- cluster_id: ID of the cluster to upload to.
Returns:
ID of the requirements file in the DB.
wait_for_finish(interval=5, timeout=600)
Wait for a job to finish.
This means waiting until it’s no longer in “Queued” or “Running” statuses.
Args:
- interval (int): Interval in seconds to check if the job has finished running.
- timeout (int): Timeout in seconds.
Returns:
The status of the job.
wait_for_status(*expected_statuses, interval=5, timeout=600)
Wait for the job to be in one of the given statuses.
Args:
- *expected_statuses (str): List of expected job statuses.
- interval (int): Interval in seconds to check the job’s status.
- timeout (int): Timeout in seconds.
Returns:
The status of the job.
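The polling behavior that wait_for_finish and wait_for_status describe can be sketched as a loop that checks the status every interval seconds until a condition holds or the timeout passes. This is an illustration under stated assumptions, not the SDK's code: get_status is a hypothetical callable standing in for job.get_status, statuses are treated as plain strings, and raising TimeoutError on timeout is an assumption, since the timeout behavior is not documented here.

```python
import time
from typing import Callable


def _poll(get_status: Callable[[], str], is_done: Callable[[str], bool],
          interval: float, timeout: float) -> str:
    """Call get_status repeatedly until is_done(status) or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            # Assumption: the real library may signal a timeout differently.
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        time.sleep(interval)


def wait_for_status(get_status, *expected_statuses, interval=5, timeout=600):
    """Wait until the status is one of expected_statuses."""
    return _poll(get_status, lambda s: s in expected_statuses, interval, timeout)


def wait_for_finish(get_status, interval=5, timeout=600):
    """Wait until the status is no longer "Queued" or "Running"."""
    return _poll(get_status, lambda s: s not in ("Queued", "Running"),
                 interval, timeout)
```

Both helpers return the last status they observed, which matches the documented return value of the two methods.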
Repository
DockerImage
Cluster
Asset
class disco.Asset
Provides functionality for uploading and downloading DISCO files.
input_files_from_bucket(bucket_paths, cluster_id)
Args:
- bucket_paths (list(str)):
- cluster_id (str):
Raises:
BucketPathsException: If the bucket path is invalid, missing, or has no files.
Returns:
list: List of file IDs registered from the given bucket paths.
upload(file_name, file, cluster=None, show_progress_bar=True)
Upload a file to DISCO so that it can later be used to run jobs.
Args:
- file_name (str):
- file: The file contents in binary or string form, a file object, or a Path object that points to a file.
- cluster (ClusterDetails):
- show_progress_bar (bool):
Returns:
str: The ID of the uploaded file.
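The file argument accepts several input forms. One way to picture how such an argument might be normalized is the sketch below, which reduces each accepted form to bytes. read_file_contents is a hypothetical helper and not how the SDK handles the argument internally; encoding strings as UTF-8 is also an assumption.

```python
from pathlib import Path


def read_file_contents(file) -> bytes:
    """Reduce contents, a file object, or a Path to raw bytes."""
    if isinstance(file, (bytes, bytearray)):
        return bytes(file)  # binary contents
    if isinstance(file, str):
        return file.encode("utf-8")  # string contents (assumed UTF-8)
    if isinstance(file, Path):
        return file.read_bytes()  # Path pointing to a file
    if hasattr(file, "read"):
        data = file.read()  # file object, opened in text or binary mode
        return data.encode("utf-8") if isinstance(data, str) else data
    raise TypeError(f"unsupported file argument: {type(file).__name__}")
```

Note that in this sketch a plain str is treated as file contents, as the description above states, so a path must be passed as a Path object rather than as a string.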