Contents
- Bag of Visual Words
- Java Implementation
Author: qianyang, School of Management, Hefei University of Technology (email: 1563178220@qq). There may be mistakes; discussion is welcome.
Reproduction without the author's permission is prohibited.
Bag of Visual Words
The general procedure for building a visual dictionary is: (1) extract keypoints from all images; (2) cluster all the keypoints; (3) for each image, count how often each cluster label occurs among its keypoints; these frequencies form the image's bag-of-visual-words representation.
Keypoint extraction is mainly done with SIFT, a classic computer-vision algorithm from the era before deep learning. The corresponding papers are:
Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International journal of computer vision, 2004, 60(2): 91-110.
Lowe D G. Object recognition from local scale-invariant features[C]//Proceedings of the seventh IEEE international conference on computer vision. IEEE, 1999, 2: 1150-1157.
The citation counts shown in the figure below give a sense of their influence.
Clustering the keypoint descriptors is most commonly done with K-means.
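Before turning to OpenIMAJ, step (3) above can be sketched in plain Java: once every keypoint has been assigned a cluster label by K-means (steps 1 and 2), counting the labels per image yields that image's visual-word frequency vector. The labels and dictionary size below are made up purely for illustration:

```java
import java.util.Arrays;

public class BoVWSketch {
    // Count how often each visual word (cluster label) occurs in one image.
    static int[] toHistogram(int[] labels, int dictionarySize) {
        int[] hist = new int[dictionarySize];
        for (int label : labels) {
            hist[label]++; // step (3): tally each keypoint's cluster label
        }
        return hist;
    }

    public static void main(String[] args) {
        // Hypothetical cluster labels for one image's keypoints
        // (keypoint extraction and clustering are assumed already done)
        int[] labels = {2, 0, 2, 1, 2};
        int[] hist = toHistogram(labels, 4);
        System.out.println(Arrays.toString(hist)); // [1, 1, 3, 0]
    }
}
```

The resulting vector is the image's "document" over the visual vocabulary, directly analogous to word counts in text retrieval.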
Java Implementation
Below, the Java framework OpenIMAJ is used to extract each image's keypoints and its bag of visual words.
First, create a Maven project and add the required dependencies to the POM file:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-math3</artifactId>
    <version>3.6.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.openimaj/image-feature-extraction -->
<dependency>
    <groupId>org.openimaj</groupId>
    <artifactId>image-feature-extraction</artifactId>
    <version>1.3.10</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.openimaj/image-local-features -->
<dependency>
    <groupId>org.openimaj</groupId>
    <artifactId>image-local-features</artifactId>
    <version>1.3.10</version>
</dependency>
Next, the input directory is:
The images it contains are shown below:
The Java implementation is as follows:
package BagVWord.BagVWord;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import org.openimaj.data.DataSource;
import org.openimaj.feature.SparseIntFV;
import org.openimaj.feature.local.data.LocalFeatureListDataSource;
import org.openimaj.feature.local.filter.ByteEntropyFilter;
import org.openimaj.feature.local.list.LocalFeatureList;
import org.openimaj.feature.local.quantised.QuantisedLocalFeature;
import org.openimaj.image.FImage;
import org.openimaj.image.ImageUtilities;
import org.openimaj.image.MBFImage;
import org.openimaj.image.colour.RGBColour;
import org.openimaj.image.feature.local.aggregate.BagOfVisualWords;
import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
import org.openimaj.image.feature.local.keypoints.Keypoint;
import org.openimaj.image.feature.local.keypoints.KeypointLocation;
import org.openimaj.image.feature.local.keypoints.KeypointVisualizer;
import org.openimaj.math.statistics.distribution.Histogram;
import org.openimaj.ml.clustering.ByteCentroidsResult;
import org.openimaj.ml.clustering.assignment.HardAssigner;
import org.openimaj.ml.clustering.kmeans.ByteKMeans;
import org.openimaj.util.filter.FilterUtils;
public class BoVWExtr {
public static void main(String[] args) throws Exception {
//Read the list of image files
final List<String> fileList = readDirs("image/");
//Initialise the difference-of-Gaussian SIFT engine for feature extraction
final DoGSIFTEngine engine = new DoGSIFTEngine();
//Map each image file to its list of local features
final Map<String, LocalFeatureList<Keypoint>> imageKeypoints = new HashMap<String, LocalFeatureList<Keypoint>>();
//Load each image and extract its keypoints
for (final String file : fileList) {
final FImage image = ImageUtilities.readF(new File(file));
imageKeypoints.put(file, engine.findFeatures(image));
}
//K-means: cluster the 128-dimensional SIFT descriptors into 50 clusters
final ByteKMeans kmeans = ByteKMeans.createKDTreeEnsemble(50);
//The data source exposes every keypoint descriptor across all images; these are what get clustered
final DataSource<byte[]> datasource = new LocalFeatureListDataSource<Keypoint, byte[]>(imageKeypoints);
//Run the clustering
final ByteCentroidsResult result = kmeans.cluster(datasource);
System.out.println("========result");
System.out.println(result.getCentroids()[5][1]);
byte[][] centroids = result.getCentroids();
System.out.println(Arrays.toString(datasource.getData(0)));
System.out.println(centroids.length);
//Create the hard assigner that maps a descriptor to its nearest centroid
final HardAssigner<byte[], ?, ?> assigner = result.defaultHardAssigner();
//Aggregate visual-word occurrence counts per image
final BagOfVisualWords<byte[]> bovw = new BagOfVisualWords<byte[]>(assigner);
final Map<String, SparseIntFV> occurrences = new HashMap<String, SparseIntFV>();
for (final Entry<String, LocalFeatureList<Keypoint>> entry : imageKeypoints.entrySet()) {
occurrences.put(entry.getKey(), bovw.aggregate(entry.getValue()));
}
//Output the bag of visual words
for (String file : fileList) {
List<QuantisedLocalFeature<KeypointLocation>> quantisedFeatures = BagOfVisualWords
.computeQuantisedFeatures(assigner, imageKeypoints.get(file));
//Create the visual-word occurrence histogram; its length must equal the vocabulary size (50 clusters above)
SparseIntFV features = BagOfVisualWords.extractFeatureFromQuantised(quantisedFeatures, 50);
//Build a normalised histogram for use in inter-image distance computations
Histogram featureHistogram = new Histogram(features.asDoubleVector());
featureHistogram = new Histogram(featureHistogram.normaliseFV());
//Print this image's visual-word frequencies
for(int i = 0; i < features.length(); i++){
System.out.print(features.getVector().get(i) + "\t");
}
System.out.println();
//Visualise the extracted keypoints and write the result to disk
List<Keypoint> keys1f = FilterUtils.filter(imageKeypoints.get(file), new ByteEntropyFilter());
KeypointVisualizer<Float[], MBFImage> viz = new KeypointVisualizer<Float[], MBFImage>(ImageUtilities.readMBF(new File(file)), keys1f);
MBFImage outimg = viz.drawPatches(RGBColour.GREEN, null);
ImageUtilities.write(outimg, new File("output/" + new File(file).getName()));
}
//What does each visual word represent? Here, for every image, the keypoints assigned to cluster 40 are visualised
for (final Entry<String, LocalFeatureList<Keypoint>> entry : imageKeypoints.entrySet()) {
LocalFeatureList<Keypoint> features = entry.getValue();
List<Keypoint> keyOFValue1 = new ArrayList<Keypoint>();
for (final Keypoint f : features) {
final int idx = assigner.assign(f.getFeatureVector().values);
if (idx == 40) {
keyOFValue1.add(f);
}
}
KeypointVisualizer<Float[], MBFImage> viz = new KeypointVisualizer<Float[], MBFImage>(ImageUtilities.readMBF(new File(entry.getKey())), keyOFValue1);
MBFImage outimg = viz.drawPatches(RGBColour.GREEN, null);
ImageUtilities.write(outimg, new File("cluster/" + new File(entry.getKey()).getName()));
}
}
//Recursively list all files under a directory
public static List<String> readDirs(String filepath) throws FileNotFoundException, IOException {
final List<String> fileList = new ArrayList<String>();
final File file = new File(filepath);
if (!file.isDirectory()) {
System.out.println("The given path is not a directory");
System.out.println("filepath: " + file.getAbsolutePath());
} else {
for (final String name : file.list()) {
final File newfile = new File(file, name);
if (!newfile.isDirectory()) {
fileList.add(newfile.getAbsolutePath());
} else {
//if the entry is a directory, recurse and collect its files
fileList.addAll(readDirs(newfile.getAbsolutePath()));
}
}
}
return fileList;
}
}
The code above saves the keypoint-extraction visualisations to the output/ folder, as shown below:
One image in detail:
In addition, the keypoints belonging to each cluster are saved to the cluster/ directory.
Finally, the resulting bag of words is printed to the console:
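The normalised histograms built in the code (the featureHistogram) are intended for measuring how similar two images are: close histograms mean similar visual-word distributions. A minimal sketch of such a comparison, using Euclidean distance on hypothetical normalised frequency vectors (the numbers below are made up):

```java
public class HistogramDistance {
    // Euclidean distance between two normalised visual-word histograms;
    // smaller distance means more similar visual-word distributions.
    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Hypothetical normalised word frequencies for two images
        double[] img1 = {0.5, 0.25, 0.25, 0.0};
        double[] img2 = {0.5, 0.25, 0.0, 0.25};
        System.out.println(euclidean(img1, img2)); // smaller means more similar
    }
}
```

Other measures (histogram intersection, cosine similarity, chi-squared) are also common for bag-of-words vectors; Euclidean distance is shown here only as the simplest choice.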
Copyright notice: this article, "Building a Bag of Visual Words with OpenIMAJ in Java", was contributed by its author; to reproduce it, contact the author and credit the source: https://m.elefans.com/dianzi/1728871525a1177327.html