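I have a large document file (~1.5 GB, ~430,000 lines). The program reads the file line by line and inserts each line into RethinkDB, but I get this error somewhere between lines 11,000 and 12,000. What is the cause?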
RethinkDBConnectionFactory.java
public class RethinkDBConnectionFactory {

    private final RethinkDB r = RethinkDB.r;
    private String host;

    public RethinkDBConnectionFactory(String host) {
        this.host = host;
    }

    public Connection createConnection() {
        return r.connection().hostname(host).connect();
    }

    public String getHost() {
        return host;
    }

    public void setHost(String host) {
        this.host = host;
    }
}
InserterService
@Service
public class InserterService {

    protected final Logger log = LoggerFactory.getLogger(InserterService.class);
    private final RethinkDB r = RethinkDB.r;

    @Autowired
    private RethinkDBConnectionFactory connectionFactory;

    private ObjectMapper oMapper = new ObjectMapper();

    @SuppressWarnings("unchecked")
    public void insertData(Activity activity) {
        oMapper.setSerializationInclusion(Include.NON_NULL);
        Map<Object, Object> map = oMapper.convertValue(activity, Map.class);
        r.db("test").table("twitter").insert(map).run(connectionFactory.createConnection());
    }
}
RethinkDBConfiguration
@Configuration
@ComponentScan(basePackages = { "com.erdem.rethinkdb.inserter.*" })
@PropertySource("classpath:application.yml")
public class RethinkDBConfiguration {

    @Value("${rethinkdb.host}")
    private String DBHOST;

    @Bean
    public RethinkDBConnectionFactory connectionFactory() {
        return new RethinkDBConnectionFactory(DBHOST);
    }

    @Bean
    public DbInitializer dbInitializer() {
        return new DbInitializer();
    }
}
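I can't see where or how the file is being read, and I suspect the error is there. You can always run the JVM with -XX:+HeapDumpOnOutOfMemoryError to get a heap dump, then open it in Eclipse MAT to check what went wrong.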
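Try inserting these objects one line at a time. Right now you appear to be mapping all 430k lines and holding them in memory, which can cause problems when you insert them into the DB.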
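Since the file-reading code isn't shown, here is only a minimal sketch of what per-line processing could look like, using nothing but the standard library. The class name LineStreamer and the handler parameter are hypothetical; in your case the handler would parse the line and call something like insertData.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

// Hypothetical helper (not from the question): streams a file line by line
// so that only the current line is held in memory at any time.
final class LineStreamer {

    // Passes each line to the handler and returns the number of lines read.
    static long forEachLine(Path file, Consumer<String> handler) throws IOException {
        long count = 0;
        // try-with-resources closes the reader even if the handler throws
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line); // e.g. parse the line and insert it
                count++;
            }
        }
        return count;
    }
}
```

The point of this shape is that nothing accumulates: each line becomes eligible for garbage collection as soon as the handler returns, as long as the handler itself does not store it in a collection.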
Or increase the JVM's memory allocation (for example with the -Xmx option).
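If you go the heap-size route, it may help to confirm which limit the JVM is actually running with before and after changing it. A small sketch, assuming a hypothetical class name HeapInfo; Runtime.maxMemory() is the standard call that reports the maximum heap the JVM will attempt to use.

```java
// Hypothetical helper (not from the question): reports the JVM's maximum heap size.
final class HeapInfo {

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as, say, java -Xmx4g HeapInfo (the 4g value is just an illustration) should print a noticeably larger number than running it without the flag. Keep in mind that a bigger heap only postpones the error if something is leaking.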