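YARN Basics
1: YARN queue resource allocation:
yarn.nodemanager.resource.memory-mb 58G [Note] Total physical memory on this node that YARN may use, in MB. Set it to 75%-90% of the node's total physical memory. If other services run resident processes on the node, lower this value to leave them enough resources.
yarn.nodemanager.resource.cpu-vcores 32 [Note] Number of virtual CPU cores on this node that the NodeManager may allocate to containers. Set it to 1.5 to 2 times the node's actual number of logical cores.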
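The sizing guidance above can be turned into a quick calculation. The following is only a minimal sketch: the helper names `yarn_memory_mb` and `yarn_vcores` are hypothetical (they are not part of Hadoop), and the caller supplies the node's total memory in MB, the percentage to hand to YARN, the logical core count, and the multiplier expressed in tenths (15 means 1.5x).

```shell
# Hypothetical helpers, not part of Hadoop: integer arithmetic only.

# yarn_memory_mb TOTAL_MB PERCENT
# Prints a candidate value for yarn.nodemanager.resource.memory-mb.
yarn_memory_mb() {
  total_mb=$1
  percent=$2
  echo $(( total_mb * percent / 100 ))
}

# yarn_vcores LOGICAL_CORES FACTOR_X10
# Prints a candidate value for yarn.nodemanager.resource.cpu-vcores.
# FACTOR_X10 is the multiplier times ten (15 = 1.5x, 20 = 2x).
yarn_vcores() {
  cores=$1
  factor_x10=$2
  echo $(( cores * factor_x10 / 10 ))
}
```

For example, `yarn_memory_mb 65536 90` gives the 90% figure for a 64 GB node, and `yarn_vcores 16 20` gives 2x the logical cores of a 16-core node. Results are rounded down.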
2: YARN open-source REST API:
http://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
Secure cluster:
kinit user
curl -k -i --negotiate -u : "https://192.168.0.108:8090/ws/v1/cluster/metrics"
curl -k -i --negotiate -u : "https://192.168.0.108:8090/ws/v1/cluster/"
curl -k -i --negotiate -u : "https://192.168.0.108:8090/ws/v1/cluster/scheduler"
Non-secure cluster:
curl -k -i --negotiate -u : "http://192.168.0.244:8088/ws/v1/cluster/metrics"
curl -k -i --negotiate -u : "http://192.168.0.108:8088/ws/v1/cluster/"
{JSONUtil.toString(JSONUtil.path(Job.getNodeOutput("Rest_Client_1958"),"clusterMetrics[0].appsRunning"))} > 2
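The same "more than 2 running applications" check can be done directly on the metrics response from the command line. This is a rough sketch under stated assumptions: `apps_running` and `too_many_running` are hypothetical helper names, and the extraction relies on `"appsRunning"` appearing once in the response followed by an integer, as in the Cluster Metrics API output. It uses `sed` rather than a JSON parser, so it is not robust against arbitrary JSON.

```shell
# apps_running: read a /ws/v1/cluster/metrics JSON response on stdin and
# print the integer value of "appsRunning" (first match only).
# Assumption: the key and its value sit on the same line.
apps_running() {
  sed -n 's/.*"appsRunning"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# too_many_running THRESHOLD: read the same JSON on stdin and succeed
# (exit status 0) only when appsRunning is greater than THRESHOLD.
too_many_running() {
  threshold=$1
  running=$(apps_running)
  [ -n "$running" ] && [ "$running" -gt "$threshold" ]
}
```

On a non-secure cluster this could be used as `curl -s "http://192.168.0.244:8088/ws/v1/cluster/metrics" | too_many_running 2 && echo "more than 2 applications running"`.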
3: Many YARN jobs stuck in the NEW_SAVING state
Restarting ZooKeeper does not switch the ResourceManager state back. Only restarting the ResourceManager makes it re-register with ZooKeeper.
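To confirm the symptom before and after the restart, it can help to count how many applications are in NEW_SAVING and to check the ResourceManager HA state. The sketch below is illustrative only: `count_apps` and `rm_state` are hypothetical wrapper names, and the ResourceManager ID passed to `rm_state` (for example `rm1`) is an assumption that depends on how HA is configured in the cluster.

```shell
# count_apps: count the lines on stdin that contain an application ID.
# Intended for the output of `yarn application -list`; header lines do not
# contain an ID of the form application_<digits>_<digits>, so they are skipped.
count_apps() {
  # grep -c prints 0 but exits non-zero when nothing matches; mask that.
  grep -c 'application_[0-9][0-9]*_[0-9][0-9]*' || true
}

# rm_state RM_ID: print the HA state (active/standby) of one ResourceManager.
# RM_ID is cluster-specific (assumed here, e.g. rm1).
rm_state() {
  yarn rmadmin -getServiceState "$1"
}
```

A typical check would be `yarn application -list -appStates NEW_SAVING | count_apps`, run once before restarting the ResourceManager and once afterwards.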
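4: A large backlog of YARN applications causes the instance to go down:
Change yarn.resourcemanager.max-completed-applications from 10000 to 100 so that completed applications age out automatically, then restart the instance. Deleting the accumulated applications by hand is optional.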
#YARN commands
List applications:
yarn application -list
yarn app -list
yarn app -list -appStates ALL | head -n 100
yarn app -list -appTypes 'Apache Flink'
View logs:
yarn logs -applicationId application_xxx_xxx
Kill applications:
yarn application -kill application_xxx_xxx
yarn app -kill application_1633941199020_0029
for i in $(yarn application -list | grep -w RUNNING | awk '{print $1}' | grep application_); do yarn application -kill "$i"; done
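The one-liner above matches RUNNING anywhere on the line, so an application whose name contains that word would also be selected. A slightly stricter variant is sketched below. It assumes that `yarn application -list` prints tab-separated columns with State as the sixth column, which should be verified against the Hadoop version in use. The function names `running_app_ids` and `kill_running_apps` are hypothetical.

```shell
# running_app_ids: read `yarn application -list` output on stdin and print
# the IDs of applications whose State column is exactly RUNNING.
# Assumption: columns are tab-separated and State is column 6.
running_app_ids() {
  awk -F'\t' '{
    id = $1; state = $6
    gsub(/^[ ]+|[ ]+$/, "", id)      # strip column padding
    gsub(/^[ ]+|[ ]+$/, "", state)
    if (id ~ /^application_/ && state == "RUNNING") print id
  }'
}

# kill_running_apps: kill every RUNNING application. Defined but not called
# here; preview the list with running_app_ids before running it.
kill_running_apps() {
  yarn application -list -appStates RUNNING | running_app_ids |
    while read -r id; do
      yarn application -kill "$id"
    done
}
```

Running `yarn application -list | running_app_ids` first shows which applications would be killed without touching them.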
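Related parameter: yarn.resourcemanager.state-store.max-completed-applications (the maximum number of completed applications kept in the ResourceManager state store; it must not exceed yarn.resourcemanager.max-completed-applications)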