Contents

  • Bag-of-Visual-Words Model
  • Java Implementation

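Author: qianyang, School of Management, Hefei University of Technology. Email: 1563178220@qq. The content may have shortcomings; feedback is welcome.
Reproduction without the author's permission is prohibited.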

Bag-of-Visual-Words Model

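The general approach to building a bag of visual words is: (1) extract the keypoints of all images; (2) cluster all keypoint descriptors, so that each cluster acts as one visual word and the set of clusters forms the visual vocabulary; (3) for each image, look up the cluster label of every keypoint and count the labels, which gives the frequency of each visual word in that image, that is, its bag-of-visual-words vector.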
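Step (3) is plain counting. A minimal sketch in standard Java, assuming the cluster label of each keypoint is already known; the class name `BovwHistogram`, the method `countLabels`, and the sample labels are hypothetical and are not part of OpenIMAJ:

```java
import java.util.Arrays;

public class BovwHistogram {
    /**
     * Counts how many keypoints of one image fall into each cluster.
     * labels[i] is the cluster index assigned to the i-th keypoint.
     */
    public static int[] countLabels(int[] labels, int numClusters) {
        int[] histogram = new int[numClusters];
        for (int label : labels) {
            if (label < 0 || label >= numClusters) {
                throw new IllegalArgumentException("label out of range: " + label);
            }
            histogram[label]++;
        }
        return histogram;
    }

    public static void main(String[] args) {
        // Hypothetical cluster labels for the five keypoints of one image
        int[] labels = {0, 2, 2, 1, 2};
        // Vocabulary of 4 visual words; prints [1, 1, 3, 0]
        System.out.println(Arrays.toString(countLabels(labels, 4)));
    }
}
```

The resulting array has one entry per visual word, so images with different numbers of keypoints are all described by vectors of the same length.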

The main algorithm used to extract keypoints is SIFT. It is a classic computer vision algorithm from before the deep learning era. The corresponding papers are:
Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110.

Lowe D G. Object recognition from local scale-invariant features[C]//Proceedings of the Seventh IEEE International Conference on Computer Vision. IEEE, 1999, 2: 1150-1157.

The citation counts of these papers, shown in the figure below, give a sense of their influence.

K-means is the method commonly used to cluster the keypoint features.
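To make the clustering step concrete, here is a minimal sketch of Lloyd's algorithm (the standard k-means iteration) in plain Java. It is an illustration only, not the implementation OpenIMAJ uses: the class `TinyKMeans`, its methods, the choice of the first k points as initial centroids, and the toy 2-D data are all assumptions made for this example.

```java
import java.util.Arrays;

public class TinyKMeans {
    /** Index of the centroid closest to the point (squared Euclidean distance). */
    public static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Lloyd's algorithm; the first k points serve as the initial centroids. */
    public static double[][] cluster(double[][] points, int k, int iterations) {
        int dim = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            // Assignment step: attach every point to its nearest centroid
            for (double[] p : points) {
                int c = nearest(p, centroids);
                counts[c]++;
                for (int d = 0; d < dim; d++) {
                    sums[c][d] += p[d];
                }
            }
            // Update step: move each non-empty centroid to the mean of its points
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue;
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return centroids;
    }

    public static void main(String[] args) {
        // Toy 2-D "descriptors"; real SIFT descriptors have 128 dimensions
        double[][] points = {{0, 0}, {10, 10}, {0, 1}, {10, 11}, {1, 0}, {11, 10}};
        double[][] centroids = cluster(points, 2, 10);
        System.out.println(Arrays.deepToString(centroids));
        // A new descriptor is mapped to the visual word of its nearest centroid
        System.out.println(nearest(new double[] {9, 9}, centroids));
    }
}
```

The `nearest` method plays the same role as a hard assigner: once the centroids are fixed, every descriptor is mapped to exactly one visual word.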

Java Implementation

Below, the Java framework OpenIMAJ is used to extract the keypoints of each image and to build the bag of visual words.

First, create a Maven project and declare the required dependencies in the POM file:

		<dependency>
			<groupId>org.apache.commons</groupId>
			<artifactId>commons-math3</artifactId>
			<version>3.6.1</version>
		</dependency>
		<!-- https://mvnrepository.com/artifact/org.openimaj/image-feature-extraction -->
		<dependency>
			<groupId>org.openimaj</groupId>
			<artifactId>image-feature-extraction</artifactId>
			<version>1.3.10</version>
		</dependency>
		<!-- https://mvnrepository.com/artifact/org.openimaj/image-local-features -->
		<dependency>
			<groupId>org.openimaj</groupId>
			<artifactId>image-local-features</artifactId>
			<version>1.3.10</version>
		</dependency>

Next, the input directory is as follows:


The images it contains are shown below:

The Java code is as follows:

package BagVWord.BagVWord;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import org.openimaj.data.DataSource;
import org.openimaj.feature.SparseIntFV;
import org.openimaj.feature.local.data.LocalFeatureListDataSource;
import org.openimaj.feature.local.filter.ByteEntropyFilter;
import org.openimaj.feature.local.list.LocalFeatureList;
import org.openimaj.feature.local.quantised.QuantisedLocalFeature;
import org.openimaj.image.FImage;
import org.openimaj.image.ImageUtilities;
import org.openimaj.image.MBFImage;
import org.openimaj.image.colour.RGBColour;
import org.openimaj.image.feature.local.aggregate.BagOfVisualWords;
import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
import org.openimaj.image.feature.local.keypoints.Keypoint;
import org.openimaj.image.feature.local.keypoints.KeypointLocation;
import org.openimaj.image.feature.local.keypoints.KeypointVisualizer;
import org.openimaj.math.statistics.distribution.Histogram;
import org.openimaj.ml.clustering.ByteCentroidsResult;
import org.openimaj.ml.clustering.assignment.HardAssigner;
import org.openimaj.ml.clustering.kmeans.ByteKMeans;
import org.openimaj.util.filter.FilterUtils;
public class BoVWExtr {
	public static void main(String[] args) throws Exception {
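		// Read the paths of the input images
		final List<String> fileList = readDirs("image/");
		// Initialise the engine used to extract SIFT features
		final DoGSIFTEngine engine = new DoGSIFTEngine();
		// Local features (keypoints) found in each image, keyed by file path
		final Map<String, LocalFeatureList<Keypoint>> imageKeypoints = new HashMap<String, LocalFeatureList<Keypoint>>();
		// Load each image and extract its keypoints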
		for (final String file : fileList) {
			final FImage image = ImageUtilities.readF(new File(file));
			imageKeypoints.put(file, engine.findFeatures(image));
		}
		// K-means clustering: group the 128-dimensional keypoint descriptors into 50 clusters
		final ByteKMeans kmeans = ByteKMeans.createKDTreeEnsemble(50);
		// The data source holds the descriptors of all keypoints from all images
		final DataSource<byte[]> datasource = new LocalFeatureListDataSource<Keypoint, byte[]>(imageKeypoints);
		// Run the clustering
		final ByteCentroidsResult result = kmeans.cluster(datasource);
		System.out.println("========result");
		System.out.println(result.getCentroids()[5][1]);
		byte[][] centroids = result.getCentroids();
		System.out.println(Arrays.toString(datasource.getData(0)));
		System.out.println(centroids.length);
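		// Create the assigner that maps a descriptor to its nearest cluster
		final HardAssigner<byte[], ?, ?> assigner = result.defaultHardAssigner();
		// Count the visual-word frequencies of each image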
		final BagOfVisualWords<byte[]> bovw = new BagOfVisualWords<byte[]>(assigner);
		final Map<String, SparseIntFV> occurrences = new HashMap<String, SparseIntFV>();
		for (final Entry<String, LocalFeatureList<Keypoint>> entry : imageKeypoints.entrySet()) {
			occurrences.put(entry.getKey(), bovw.aggregate(entry.getValue()));
		}
		// Output the bag of visual words
		new File("output").mkdirs(); // make sure the output directory exists
		for (String file : fileList) {
			List<QuantisedLocalFeature<KeypointLocation>> quantisedFeatures = BagOfVisualWords
					.computeQuantisedFeatures(assigner, imageKeypoints.get(file));
			// Create the visual word occurrence histogram; its length is the
			// number of visual words, i.e. the number of cluster centroids (50)
			SparseIntFV features = BagOfVisualWords.extractFeatureFromQuantised(quantisedFeatures, centroids.length);
			// Normalised histogram of the visual-word counts (not used further in this example)
			Histogram featureHistogram = new Histogram(features.asDoubleVector());
			featureHistogram = new Histogram(featureHistogram.normaliseFV());
			// Print the visual-word frequencies, one image per line
			for(int i = 0; i < features.length(); i++){
				System.out.print(features.getVector().get(i) + "\t");
			}
			System.out.println();
			// Visualise the extracted keypoints and write the result to output/
			List<Keypoint> keys1f = FilterUtils.filter(imageKeypoints.get(file), new ByteEntropyFilter());
			KeypointVisualizer<Float[], MBFImage> viz = new KeypointVisualizer<Float[], MBFImage>(ImageUtilities.readMBF(new File(file)), keys1f);
			MBFImage outimg = viz.drawPatches(RGBColour.GREEN, null);
			ImageUtilities.write(outimg, new File("output/" + new File(file).getName()));

		}
		// What does each dimension of the feature vector represent? As an example,
		// draw, for every image, the keypoints assigned to the cluster with index 40
		new File("cluster").mkdirs(); // make sure the output directory exists
		for (final Entry<String, LocalFeatureList<Keypoint>> entry : imageKeypoints.entrySet()) {
			LocalFeatureList<Keypoint> features = entry.getValue();
			List<Keypoint> keyOFValue1 = new ArrayList<Keypoint>();
			for (final Keypoint f : features) {
				final int idx = assigner.assign(f.getFeatureVector().values);
				if (idx == 40) {
					keyOFValue1.add(f);
				}
			}
			KeypointVisualizer<Float[], MBFImage> viz = new KeypointVisualizer<Float[], MBFImage>(ImageUtilities.readMBF(new File(entry.getKey())), keyOFValue1);
			MBFImage outimg = viz.drawPatches(RGBColour.GREEN, null);
			ImageUtilities.write(outimg, new File("cluster/" + new File(entry.getKey()).getName()));
		}
	}
	// Recursively collect the absolute paths of all files under a directory
	public static List<String> readDirs(String filepath) throws FileNotFoundException, IOException {
		List<String> fileList = new ArrayList<String>();
		File dir = new File(filepath);
		if (!dir.isDirectory()) {
			System.out.println("The input path is not a directory");
			System.out.println("filepath:" + dir.getAbsolutePath());
			return fileList;
		}
		File[] children = dir.listFiles();
		if (children == null) {
			// listFiles() returns null when the directory cannot be read
			return fileList;
		}
		for (File child : children) {
			if (child.isDirectory()) {
				// Recurse into the subdirectory and keep the files found there
				fileList.addAll(readDirs(child.getPath()));
			} else {
				fileList.add(child.getAbsolutePath());
			}
		}
		return fileList;
	}
}

The code above saves the keypoint extraction results to the output/ folder, as shown in the figure below:

One of these images looks like this:

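In addition, for each image, the keypoints assigned to a given cluster (the cluster with index 40 in the code) are saved to the cluster/ directory.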
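The listing also builds a normalised histogram (featureHistogram) for each image. Normalising makes images with different numbers of keypoints comparable. The sketch below shows one common choice, L1 normalisation, in plain Java; the class `HistogramNormaliser` and the sample counts are hypothetical, and it is an assumption that this matches the normalisation performed by OpenIMAJ's `normaliseFV()`.

```java
import java.util.Arrays;

public class HistogramNormaliser {
    /** Scales the counts so that they sum to 1 (L1 normalisation). */
    public static double[] l1Normalise(int[] counts) {
        double total = 0;
        for (int c : counts) {
            total += c;
        }
        double[] result = new double[counts.length];
        if (total == 0) {
            // An image without keypoints keeps an all-zero vector
            return result;
        }
        for (int i = 0; i < counts.length; i++) {
            result[i] = counts[i] / total;
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical visual-word counts of one image
        int[] counts = {1, 1, 2, 0};
        // Prints [0.25, 0.25, 0.5, 0.0]
        System.out.println(Arrays.toString(l1Normalise(counts)));
    }
}
```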



Finally, the resulting bag of visual words is printed to the console; the output is as follows:
