Asked by: 小点点

How do I locate openjdk in a Docker container?


I am trying to run a pyspark application. To do that, I first installed pyspark with pip and then pulled openjdk:8 so that I could set the JAVA_HOME variable.

Dockerfile:

FROM python:3

ADD my_script.py /
COPY requirements.txt ./

ENV JAVA_HOME  /usr/lib/jvm/java-8-openjdk-amd64/

RUN pip install --no-cache-dir -r requirements.txt

CMD [ "python", "./my_script.py" ]

my_script.py:

from pyspark import SparkContext
from pyspark import SparkConf

#spark conf
conf1 = SparkConf()
conf1.setMaster("local[*]")
conf1.setAppName('hamza')
print(conf1)
sc = SparkContext(conf = conf1)


from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
print(sqlContext)

requirements.txt:

pyspark

numpy

I get this error:

C:\Users\hrafiq\Desktop\sample>docker run -it --rm --name data2 my-python-app
<pyspark.conf.SparkConf object at 0x7f4bd933ba58>
/usr/local/lib/python3.7/site-packages/pyspark/bin/spark-class: line 71: 
/usr/lib/jvm/java-8-openjdk-amd64//bin/java: No such file or directory
Traceback (most recent call last):
   File "./my_script.py", line 14, in <module>
    sc = SparkContext(conf = conf1)
  File "/usr/local/lib/python3.7/site-packages/pyspark/context.py", line 115, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/usr/local/lib/python3.7/site-packages/pyspark/context.py", line 298, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/usr/local/lib/python3.7/site-packages/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number  

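So my question is: since the java file is not found, how do I locate it? I know it is stored on some virtual hard disk that we cannot access.

Any help would be much appreciated.

Thanks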


1 answer

Anonymous user

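Setting the JAVA_HOME env var is not enough. You need to actually install openjdk inside your docker image.

Your base image (python:3) is itself based on a Debian Stretch image, so you can use apt-get install to get the JDK: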

FROM python:3

RUN apt-get update && \
    apt-get install -y openjdk-8-jdk-headless && \
    rm -rf /var/lib/apt/lists/*
ENV JAVA_HOME  /usr/lib/jvm/java-8-openjdk-amd64/

COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY my_script.py ./
CMD [ "python", "./my_script.py" ]

(In the above, I also optimized the layer ordering so that you don't need to rebuild the pip install layer when only the script changes.)
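
To confirm from inside the container that JAVA_HOME really points at an installed JVM, a small check along these lines could run at the top of my_script.py, before the SparkContext is created. This is only a sketch: find_java is a hypothetical helper, not part of pyspark, and it assumes (based on the spark-class error above) that Spark looks for $JAVA_HOME/bin/java when JAVA_HOME is set.

```python
import os
import shutil


def find_java(env=None):
    """Return the path to a usable java executable, or None if there is none.

    Hypothetical helper, not a pyspark API.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if java_home:
        # Assumption: when JAVA_HOME is set, Spark uses $JAVA_HOME/bin/java.
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
        # JAVA_HOME is set but no JVM is installed there.
        return None
    # JAVA_HOME is unset: look for java on PATH instead.
    return shutil.which("java", path=env.get("PATH"))


if __name__ == "__main__":
    java = find_java()
    if java is None:
        raise SystemExit("No java executable found; install a JDK in the image.")
    print("Using java at", java)
```

With the original Dockerfile this would exit with the "No java executable found" message instead of the less obvious "Java gateway process exited" exception, because nothing exists under /usr/lib/jvm/java-8-openjdk-amd64/ in that image.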