
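  • Principle: compare each term's frequency in the foreground set (the documents matching the query) with its frequency in the background set (by default, every document in the index). Terms that are noticeably more frequent in the foreground than in the background are reported as significant.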
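To make the foreground/background comparison concrete, here is a minimal pure-Python sketch that runs against the same six sample documents used below. It is not how Elasticsearch is implemented: the function name `significance_scores`, the whitespace tokenization, and the simplified JLH-style score (absolute change multiplied by relative change) are assumptions for illustration only.

```python
from collections import Counter

# The six sample documents from the bulk request below: (name, type).
docs = [
    ("es hello good", "lan"),
    ("good ttt ml", "lan"),
    ("es kkk ksdl", "lan"),
    ("elastic title", "lan"),
    ("es jnlsjdin", "te"),
    ("good dsfsd", "te"),
]


def significance_scores(docs, foreground_type):
    """Score terms by comparing foreground and background document frequency.

    Hypothetical helper: a simplified JLH-style score, not Elasticsearch code.
    """
    foreground = [name for name, doc_type in docs if doc_type == foreground_type]
    background = [name for name, _ in docs]  # the background is every document

    # Document frequency: a term counts once per document that contains it.
    fg_counts = Counter(t for name in foreground for t in set(name.split()))
    bg_counts = Counter(t for name in background for t in set(name.split()))

    scores = {}
    for term, fg_count in fg_counts.items():
        fg_pct = fg_count / len(foreground)
        bg_pct = bg_counts[term] / len(background)
        # Keep only terms that are more frequent in the foreground.
        if fg_pct > bg_pct:
            scores[term] = (fg_pct - bg_pct) * (fg_pct / bg_pct)
    return scores


scores = significance_scores(docs, "te")
for term, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
    print(term, round(score, 3))
```

With `te` as the foreground, `es` and `good` each appear in half of the foreground documents and half of all documents, so they are not scored; only `jnlsjdin` and `dsfsd`, which are rarer in the background, come out as significant.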

significant_text

from elasticsearch import Elasticsearch
import urllib3

urllib3.disable_warnings()

# PUT es_significant_text
# {
#   "mappings": {
#     "properties": {
#       "name": {"type": "text"},
#       "type": {"type": "keyword"}
#     }
#   }
# }

# POST es_significant_text/_bulk
# {"index": {"_id": 1}}
# {"name": "es hello good", "type": "lan"}
# {"index": {"_id": 2}}
# {"name": "good ttt ml", "type": "lan"}
# {"index": {"_id": 3}}
# {"name": "es kkk ksdl", "type": "lan"}
# {"index": {"_id": 4}}
# {"name": "elastic title", "type": "lan"}
# {"index": {"_id": 5}}
# {"name": "es jnlsjdin", "type": "te"}
# {"index": {"_id": 6}}
# {"name": "good dsfsd", "type": "te"}

# GET es_significant_text/_search
# {
#   "query": {"term": {
#     "type": {
#       "value": "te"
#     }
#   }},
#   "size": 0,
#   "aggs": {
#     "my_significant_text": {
#       "significant_text": {
#         "field": "name",
#         "min_doc_count": 1
#       }
#     }
#   }
# }

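# Create the Elasticsearch client
# (verify_certs=False skips TLS certificate verification; test clusters only)
es = Elasticsearch("https://192.168.2.64:9200",
                   verify_certs=False,
                   basic_auth=("elastic", "MuZkDqdW--VsfDjTcoex"),
                   request_timeout=60,
                   max_retries=3,
                   retry_on_timeout=True,
                   node_selector_class="round_robin")

# Refresh the index so that recently indexed documents are visible to search
es.indices.refresh(index="es_significant_text")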

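# Aggregation: find terms in the "name" field that are unusually frequent
# in the foreground set; min_doc_count=1 keeps terms seen in a single document
significant_text = {
    "my_significant_text": {
        "significant_text": {
            "field": "name",
            "min_doc_count": 1
        }
    }
}

# Foreground set: documents whose "type" is "te"
query = {
    "term": {
        "type": {
            "value": "te"
        }
    }
}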

resp = es.search(index="es_significant_text", size=0, query=query, aggregations=significant_text)

print(resp['aggregations']['my_significant_text']['buckets'])
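
Printing the raw bucket list can be hard to read. The sketch below formats each bucket on one line, assuming the usual `significant_text` bucket fields `key`, `doc_count`, `bg_count` and `score`. The helper name `summarize_buckets` is hypothetical, and `sample_buckets` is hand-written for illustration; it is not output captured from a real cluster.

```python
def summarize_buckets(buckets):
    """Return one readable line per significant_text bucket (hypothetical helper)."""
    lines = []
    for bucket in buckets:
        lines.append(
            f"{bucket['key']}: doc_count={bucket['doc_count']}, "
            f"bg_count={bucket['bg_count']}, score={bucket['score']:.3f}"
        )
    return lines


# Hand-written sample in the shape of a bucket list, NOT real cluster output.
sample_buckets = [
    {"key": "dsfsd", "doc_count": 1, "score": 1.0, "bg_count": 1},
]

for line in summarize_buckets(sample_buckets):
    print(line)
```

Against a live cluster, the same helper could be called with `resp['aggregations']['my_significant_text']['buckets']` in place of the sample list.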
