YARN Basics

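1: YARN queue resource allocation:
yarn.nodemanager.resource.memory-mb 58G [Note] Total physical memory on this node that YARN may use. The value is given in MB (58G corresponds to 59392). The recommended setting is 75%-90% of the node's total physical memory. If other services run resident processes on this node, lower the value so those processes keep enough resources to run.
yarn.nodemanager.resource.cpu-vcores 32 [Note] Number of virtual CPU cores that the NodeManager on this node may allocate to containers. The recommended setting is 1.5 to 2 times the node's actual number of logical cores.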
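
The sizing guidance above can be checked with a quick calculation. The following is a minimal sketch, not a tuning tool: the helper names `suggest_memory_mb` and `suggest_vcores` are hypothetical, and the example node (65536 MB of RAM, 16 logical cores) is an assumption chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers that apply the percentages and multipliers from the notes.

suggest_memory_mb() {
  # $1 = total physical memory in MB, $2 = percentage given to YARN (75-90)
  echo $(( $1 * $2 / 100 ))
}

suggest_vcores() {
  # $1 = logical cores, $2 = multiplier times 10 (15 means 1.5x, 20 means 2x)
  echo $(( $1 * $2 / 10 ))
}

# Assumed example node: 65536 MB RAM, 16 logical cores.
echo "yarn.nodemanager.resource.memory-mb  $(suggest_memory_mb 65536 90)"
echo "yarn.nodemanager.resource.cpu-vcores $(suggest_vcores 16 20)"
```

Integer arithmetic is used so the sketch runs in any POSIX shell; the multiplier is passed as tenths for the same reason.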

2: YARN open-source REST API:
http://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html

Secure cluster:
kinit user
curl -k -i --negotiate -u : "https://192.168.0.108:8090/ws/v1/cluster/metrics"
curl -k -i --negotiate -u : "https://192.168.0.108:8090/ws/v1/cluster/"
curl -k -i --negotiate -u : "https://192.168.0.108:8090/ws/v1/cluster/scheduler"
Non-secure cluster (no Kerberos, so the SPNEGO options are not needed):
curl -i "http://192.168.0.244:8088/ws/v1/cluster/metrics"
curl -i "http://192.168.0.108:8088/ws/v1/cluster/"

{JSONUtil.toString(JSONUtil.path(Job.getNodeOutput("Rest_Client_1958"),"clusterMetrics[0].appsRunning"))} > 2
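
The same "appsRunning greater than 2" check can be approximated in plain shell. This is a rough sketch under stated assumptions: the function name `apps_running` is hypothetical, the JSON in `sample` is a trimmed illustration rather than real cluster output, and the text-based extraction only works because `appsRunning` is a plain integer field inside the `clusterMetrics` object returned by `/ws/v1/cluster/metrics`.

```shell
#!/bin/sh
# Hypothetical helper: read a /ws/v1/cluster/metrics response on stdin
# and print the integer value of the appsRunning field.
apps_running() {
  grep -o '"appsRunning"[[:space:]]*:[[:space:]]*[0-9][0-9]*' | grep -o '[0-9][0-9]*$'
}

# Illustrative, trimmed response body (not real cluster output).
sample='{"clusterMetrics":{"appsSubmitted":10,"appsPending":1,"appsRunning":3}}'

running=$(printf '%s\n' "$sample" | apps_running)
if [ "$running" -gt 2 ]; then
  echo "appsRunning=$running exceeds threshold 2"
fi

# Against a live secure cluster the input would come from curl instead, e.g.:
#   curl -k -s --negotiate -u : "https://192.168.0.108:8090/ws/v1/cluster/metrics" | apps_running
```

A JSON-aware tool is more robust than grep if one is installed; grep is used here only to keep the sketch free of extra dependencies.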

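3: Large numbers of YARN applications stuck in the NEW_SAVING state:
Restarting ZooKeeper does not switch the ResourceManager state back. Only restarting the ResourceManager makes it register with ZooKeeper again.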
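
Before and after restarting the ResourceManager it helps to see how many applications are still in NEW_SAVING and which HA state each ResourceManager reports. The sketch below wraps two standard CLI calls, `yarn application -list -appStates` and `yarn rmadmin -getServiceState`. The function names are hypothetical, and the service ID passed to `rm_ha_state` (for example `rm1`) is an assumption that must match the IDs configured in `yarn.resourcemanager.ha.rm-ids` on the cluster.

```shell
#!/bin/sh
# Hypothetical helper: count applications currently in NEW_SAVING.
# Only lines whose first field is an application ID are counted,
# so the summary and column header lines are ignored.
new_saving_count() {
  yarn application -list -appStates NEW_SAVING 2>/dev/null |
    awk '$1 ~ /^application_/ { n++ } END { print n + 0 }'
}

# Hypothetical helper: print the HA state (active or standby) of one ResourceManager.
# $1 = RM service ID, assumed to be one of yarn.resourcemanager.ha.rm-ids.
rm_ha_state() {
  yarn rmadmin -getServiceState "$1"
}
```

Typical use is `new_saving_count` followed by `rm_ha_state rm1`; a count that keeps growing while no ResourceManager reports `active` matches the situation described above.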

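4: A large backlog of YARN applications brings the instance down:

Change yarn.resourcemanager.max-completed-applications from 10000 to 100 so that completed applications age out automatically, then start the instance again. Deleting the old records by hand is optional.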
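
To confirm which value is actually configured before starting the instance again, the property can be read back from the configuration file. This is a minimal sketch: `get_prop` is a hypothetical helper, it assumes the usual Hadoop layout where each `<name>` and `<value>` element sits on a single line with the value after the name, and the sample file is generated here only for demonstration.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> of one property from a Hadoop-style XML file.
# $1 = path to the XML file, $2 = exact property name.
get_prop() {
  awk -v n="$2" '
    index($0, "<name>" n "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }' "$1"
}

# Demonstration on a throwaway file (illustrative content, not a real yarn-site.xml).
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.max-completed-applications</name>
    <value>100</value>
  </property>
</configuration>
EOF
get_prop "$conf" yarn.resourcemanager.max-completed-applications
rm -f "$conf"
```

On a real node the first argument would be the path to the ResourceManager's yarn-site.xml, whose location depends on the distribution.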

#YARN commands
List applications:
yarn application -list
yarn app -list
yarn app -list -appStates ALL | head -n 100

yarn app -list -appTypes 'Apache Flink'

View logs:
yarn logs -applicationId <applicationId>

Kill an application:
yarn application -kill application_xxx_xxx
yarn app -kill application_1633941199020_0029

for i in $(yarn application -list | grep -w RUNNING | awk '{print $1}' | grep application_); do yarn application -kill "$i"; done
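
Because the loop above kills every RUNNING application, a dry-run switch is a cheap safeguard. The following sketch is one possible variant, not an established tool: the function names and the `DRY_RUN` variable are hypothetical, and the filtering relies on the same `grep -w RUNNING` match as the one-liner, so an application whose name contains the word RUNNING would also be selected.

```shell
#!/bin/sh
# Hypothetical helper: read `yarn application -list` output on stdin
# and print the IDs of applications whose line contains the word RUNNING.
running_app_ids() {
  grep -w RUNNING | awk '{print $1}' | grep '^application_'
}

# Hypothetical helper: kill all RUNNING applications.
# DRY_RUN=1 (the default in this sketch) only prints what would be executed.
kill_running() {
  yarn application -list -appStates RUNNING 2>/dev/null | running_app_ids |
  while read -r app_id; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: yarn application -kill $app_id"
    else
      # stdin is redirected so the command cannot consume the remaining IDs
      yarn application -kill "$app_id" </dev/null
    fi
  done
}
```

Run `kill_running` first to review the list, then `DRY_RUN=0 kill_running` to actually kill the applications.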

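Related property: yarn.resourcemanager.state-store.max-completed-applications (maximum number of completed applications kept in the ResourceManager state store; it should not exceed yarn.resourcemanager.max-completed-applications)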