01 What is LSF
1. Definition
- Stands for 'Load Sharing Facility'
- An IBM software product; in short, a job scheduling program (workload management platform)
- Distributes jobs across a variety of IT resources to create a shared, scalable, and fault-tolerant infrastructure that delivers faster, more reliable workload performance while reducing cost
- Balances load, allocates resources, and provides access to those resources
- Provides a resource management framework that takes your job requirements, finds the best resources to run the job, and monitors its progress
- Jobs always run according to host load and site policies
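To make the idea of "finding the best resources for a job" concrete, here is a minimal Python sketch. It is a toy model, not LSF's actual scheduling algorithm: the `Host` fields, the `pick_host` function, the memory requirement, and the `max_load` threshold (standing in for a site policy) are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    load: float       # hypothetical normalized load, lower is better
    free_mem_gb: int  # hypothetical free memory on the host


def pick_host(hosts: List[Host], need_mem_gb: int, max_load: float) -> Optional[Host]:
    """Return the least-loaded host that satisfies the job requirement
    (memory) and the illustrative site policy (load ceiling)."""
    eligible = [
        h for h in hosts
        if h.free_mem_gb >= need_mem_gb and h.load <= max_load
    ]
    # No eligible host means the job would have to wait.
    return min(eligible, key=lambda h: h.load) if eligible else None


hosts = [
    Host("hostA", load=0.9, free_mem_gb=64),  # too busy for the policy
    Host("hostB", load=0.2, free_mem_gb=8),   # not enough memory
    Host("hostC", load=0.4, free_mem_gb=32),  # meets both conditions
]
chosen = pick_host(hosts, need_mem_gb=16, max_load=0.8)
print(chosen.name if chosen else "job stays pending")  # prints "hostC"
```

The point is only that both the job's requirements and the current host load feed into the choice; a real cluster weighs many more factors.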
02 What is an LSF Cluster
1. Definition of a cluster
- A group of computers (hosts) running LSF that work together as a single unit, combining computing power, workload, and resources. A cluster provides a single-system image for a network of computing resources.
- Hosts can be grouped into a cluster in a number of ways. A cluster can contain:
  - All the hosts in a single administrative group
  - All the hosts on a subnetwork
2. Hosts
- Hosts in your cluster perform different functions
1) Management host
- LSF server host that acts as the overall coordinator for the cluster, doing all job scheduling and dispatch
2) Server host
- A host that can both submit and run jobs
3) Client host
- A host that can only submit jobs and tasks
4) Execution host
- A host that runs jobs and tasks
5) Submission host
- A host from which jobs and tasks are submitted
03 What is a Job
- A unit of work that is running in the LSF system.
- A job is a command that is submitted to LSF for execution.
- LSF schedules, controls, and tracks the job according to configured policies.
- Jobs can be complex problems, simulation scenarios, extensive calculations, or anything that needs compute power.
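Since a job is just a command submitted to LSF, a submission can be sketched as building a `bsub` command line. The Python snippet below only assembles the arguments and prints them; it does not run anything. The helper `build_bsub_argv`, the queue name `normal`, the job name, and the `./simulate` program are hypothetical. The options used are `-q` (queue), `-J` (job name), `-n` (number of slots/tasks), and `-o` (output file, where `%J` is replaced by the job ID).

```python
import shlex
from typing import List


def build_bsub_argv(command: str, queue: str, job_name: str,
                    slots: int, out_file: str) -> List[str]:
    """Assemble (but do not run) a bsub command line for the given job."""
    return [
        "bsub",
        "-q", queue,       # queue to submit to
        "-J", job_name,    # job name
        "-n", str(slots),  # number of job slots requested
        "-o", out_file,    # file that receives the job output
        *shlex.split(command),
    ]


argv = build_bsub_argv(
    "./simulate --steps 1000",  # hypothetical program
    queue="normal",
    job_name="sim1",
    slots=4,
    out_file="sim1.%J.out",
)
print(shlex.join(argv))
```

On a host with LSF installed, the list could be handed to `subprocess.run(argv)`; that step is left out here so the sketch runs anywhere.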
04 What is a Job slot
- A job slot is a bucket into which a single unit of work is assigned in the LSF system.
- Hosts can be configured with multiple job slots and you can dispatch jobs from queues until all the job slots are filled.
- You can correlate job slots with the total number of CPUs in the cluster.
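The "bucket" picture of job slots can be sketched in a few lines of Python. This is a simplified model under stated assumptions: each job needs exactly one slot, hosts are filled in the order they are listed, and the `dispatch` function and host names are invented for illustration. It does not reflect how LSF actually orders hosts.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def dispatch(pending: Deque[str], slots: Dict[str, int]) -> List[Tuple[str, str]]:
    """Assign one pending job per free slot until slots or jobs run out."""
    free = dict(slots)  # copy so the caller's slot configuration is untouched
    placed: List[Tuple[str, str]] = []
    for host in free:
        while free[host] > 0 and pending:
            placed.append((pending.popleft(), host))
            free[host] -= 1
    return placed


pending = deque(["job1", "job2", "job3", "job4"])
placed = dispatch(pending, {"hostA": 2, "hostB": 1})
print(placed)         # job1 and job2 on hostA, job3 on hostB
print(list(pending))  # job4 is still waiting because all 3 slots are filled
```

With three slots in total and four jobs, one job stays in the pending list until a slot frees up.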
05 What is a Queue
- A cluster-wide container for jobs.
- All jobs wait in queues until they are scheduled and dispatched to hosts.
- Queues do not correspond to individual hosts; each queue can use all server hosts in the cluster, or a configured subset of the server hosts.
- When you submit a job to a queue, you do not need to specify an execution host.
- LSF dispatches the job to the best available execution host in the cluster to run that job.
- Queues implement different job scheduling and control policies.
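The queue properties above (cluster-wide, not tied to one host, optionally limited to a subset of server hosts, each with its own policy) can be modeled roughly as follows. Everything here is an illustrative assumption: the `Queue` class, the `priority` field as a stand-in for "policy", the `next_job` helper, and the queue and host names. It is not how LSF stores or evaluates queue configuration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Queue:
    name: str
    priority: int                      # illustrative policy: higher is served first
    hosts: Optional[List[str]] = None  # None means "all server hosts"
    jobs: List[str] = field(default_factory=list)

    def candidate_hosts(self, server_hosts: List[str]) -> List[str]:
        """Hosts this queue may dispatch to: all of them, or the configured subset."""
        if self.hosts is None:
            return list(server_hosts)
        return [h for h in server_hosts if h in self.hosts]


def next_job(queues: List[Queue]) -> Optional[Tuple[Queue, str]]:
    """Take the oldest job from the highest-priority non-empty queue."""
    for q in sorted(queues, key=lambda q: q.priority, reverse=True):
        if q.jobs:
            return q, q.jobs.pop(0)
    return None


server_hosts = ["hostA", "hostB", "hostC"]
normal = Queue("normal", priority=30)
short = Queue("short", priority=50, hosts=["hostC"])

# Jobs are submitted to a queue only; no execution host is named.
normal.jobs.append("job1")
short.jobs.append("job2")

q, job = next_job([normal, short])
print(job, "from", q.name, "may run on", q.candidate_hosts(server_hosts))
```

Here `job2` is picked first because its queue has the higher priority, and it can only land on `hostC`, while jobs in `normal` may use any server host.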
06 What are Resources
- Resources are the objects in your cluster that are available to run work.
- Examples include, but are not limited to, hosts, CPU slots, and licenses.