A Geodyssey – Enterprise Search Discovery, Text Mining, Machine Learning, February 17
Combining minerals and lithology text embeddings for data discovery

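Summary: The author combines text embeddings from geological reports in a 3D t-SNE plot to explore similarities among lithologies and minerals, and has found some interesting associations that are now being researched. The author has also replicated a mineral association reported by others. The next step is to develop an algorithm that detects associated entities by combining data-driven and geological model techniques.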

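🧐 Combining text embeddings from geological reports in a 3D t-SNE plot

🎉 Exploring similarities and associations among lithologies and minerals

💻 Replicating and investigating a mineral association reported by others

🎯 Developing an algorithm to detect associated entities by combining the two techniques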

I’ve combined text embeddings for both lithologies and minerals, generated from word co-occurrences within thousands of geological reports, in a single 3D t-SNE plot. Following on from some recent posts I made, it may be interesting to explore the cosine similarity between embedding vectors for lithology-lithology, mineral-mineral and lithology-mineral pairs.
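As a rough illustration of the similarity comparison described above, here is a minimal sketch in plain Python. The three-dimensional vectors, the `entity_type` labels and the function names are all invented for illustration; real embeddings would have many more dimensions and come from a model trained on the report corpus.

```python
import math
from itertools import combinations

# Toy vectors, invented for illustration only (not taken from any real model).
embeddings = {
    "granite":     [0.9, 0.1, 0.2],
    "skarn":       [0.2, 0.8, 0.5],
    "scheelite":   [0.1, 0.9, 0.4],
    "molybdenite": [0.2, 0.85, 0.45],
}

# Hypothetical labels saying whether each term is a lithology or a mineral.
entity_type = {
    "granite": "lithology",
    "skarn": "lithology",
    "scheelite": "mineral",
    "molybdenite": "mineral",
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_similarities(embeddings, entity_type):
    """Group every term pair by its type pair, sorted by similarity (descending)."""
    groups = {}
    for a, b in combinations(sorted(embeddings), 2):
        key = tuple(sorted((entity_type[a], entity_type[b])))
        sim = cosine_similarity(embeddings[a], embeddings[b])
        groups.setdefault(key, []).append((a, b, sim))
    for pairs in groups.values():
        pairs.sort(key=lambda p: p[2], reverse=True)
    return groups


groups = pairwise_similarities(embeddings, entity_type)
for (type_a, type_b), pairs in groups.items():
    for a, b, sim in pairs:
        print(f"{type_a}-{type_b}: {a} / {b} = {sim:.3f}")
```

Grouping by type pair keeps the lithology-lithology, mineral-mineral and lithology-mineral comparisons separate, so each can be ranked and inspected on its own.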

This is a technique anyone can apply to large volumes of reports. I’ve spotted some potentially interesting associations, which I’m currently researching.

I have also replicated the mineral-mineral association of scheelite-molybdenite, reported by Lawley et al. (2022) from Natural Resources Canada (NRC) in their paper “Geoscience language models and their intrinsic evaluation”, using a completely different collection of reports.

They state:

“Word embeddings provide a powerful framework for evaluating and predicting mineral groups based on thousands of observations in nature from multiple trained observers over time. Minerals from disparate classification groups that plot close together provide intriguing evidence for associations that require re-examination (e.g., the lesser known association between scheelite and molybdenite in porphyry-skarn mineral systems).”

My next step is to develop an algorithm that automatically scans such data-driven arrays to detect candidates for closely associated entities that may not be well known. The algorithm would incorporate some geological model knowledge in the form of rules. For example, it would discount ‘obvious’ associations between and within entity types where the geological structuration, depositional or diagenetic/mineralization mechanism is well known, e.g. clays-marls, copper-zinc, chalcopyrite-galena and black shale-pyrite. The aim is to combine data-driven and geological model techniques.
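The algorithm does not exist yet, so the following is only a sketch of one way the idea could work, assuming the rules are a simple set of known pairs to discount. The `KNOWN_ASSOCIATIONS` set, the `threshold` value, the function names and the toy vectors are all hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical rule base: pairs whose association is already well understood.
KNOWN_ASSOCIATIONS = {
    frozenset(("clay", "marl")),
    frozenset(("copper", "zinc")),
    frozenset(("chalcopyrite", "galena")),
    frozenset(("black shale", "pyrite")),
}


def candidate_associations(embeddings, known=KNOWN_ASSOCIATIONS, threshold=0.9):
    """Return highly similar pairs that are not in the known-association rules."""
    candidates = []
    for a, b in combinations(sorted(embeddings), 2):
        if frozenset((a, b)) in known:
            continue  # discount 'obvious' associations
        sim = cosine_similarity(embeddings[a], embeddings[b])
        if sim >= threshold:
            candidates.append((a, b, sim))
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Toy vectors, invented for illustration only.
toy_embeddings = {
    "chalcopyrite": [0.9, 0.2, 0.1],
    "galena":       [0.88, 0.25, 0.12],
    "scheelite":    [0.1, 0.9, 0.4],
    "molybdenite":  [0.2, 0.85, 0.45],
}

candidates = candidate_associations(toy_embeddings)
for a, b, sim in candidates:
    print(f"candidate: {a} / {b} (cosine similarity {sim:.3f})")
```

With these toy vectors, chalcopyrite-galena is highly similar but is filtered out by the rule set, so only the scheelite-molybdenite pair is left as a candidate. A real rule base would probably need to work at the level of mineral groups or deposit models rather than a flat list of pairs.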

As I mentioned in my previous post, I don’t have a specific question in mind. I’m just inductively exploring the data visualisation generated from millions of sentences, more than I could ever read, to see what catches the eye.
