当前位置:网站首页>poj1521
poj1521
2022-07-26 01:23:00 【zjsru_ Beginner】
Before solving this problem, you need to master two knowledge points .
1. No prefix encoding :
That is, the code of any data cannot be the prefix of other data codes . for instance , take A Encoded as 0,B Make up for 01 It can be seen at this time A Just add it back 1 Namely B, therefore A by B The prefix of . Prefix free coding is the opposite of the above example , Take the same example ,A Encoded as 1,B Encoded as 01, You can see here A yes B No prefix encoding .
The role of prefix free coding , It is mainly used in unequal length coding , for instance , We will AAAABBAA This data is encoded , If you code by a fixed length , Each letter accounts for 8 Seats, then a total of 64 position , If we allocate the coding length according to the size of the frequency , For example, there will be a maximum number of A Encoded as 1,B Encoded as 01 In this way, the whole segment of data only needs to occupy 10 position , As for why to use prefixed encoding , It is to ensure that there is no reading error when reading the code , for instance , We make A by 0,B by 01,C by 001, When AB When two letters are put together , We can't tell if he is AB Two letters or a single letter C, So when we use unequal length coding , We need to use prefix free coding to ensure that we can read the data segment correctly .
2. Huffman tree :
In fact, the unequal length and prefixed code we mentioned earlier can also be called Huffman code , The Huffman tree is to build a Huffman tree with the frequency of characters as the weight , The result is that every character is prefixed , Thus reducing the length of the code , To improve the compression ratio .
After understanding the above two knowledge points , In fact, I roughly understand the meaning of the topic
I use two examples to explain the general meaning of the topic
Example 1: To text AAAAABCD Encoding , If you use 8 Bit ASCII Coding requires 64 position , In Huffman code , We make A by 0、B by 10、C by 110、D by 111, Using this encoding only requires 13 by , Compare the two codes as 4.9:1.
Example 2: To text THE CAT IN THE HAT. Encoding , It should be noted that spaces should also be included . Use 8 Bit encoding requires 144 position , Using Huffman coding, we make the space as 00、A by 100、C by 1110、E by 1111、H by 110、I by 1010、N by 1011、T by 01. This kind of coding only needs 51 by . The comparison between the two is 2.8:1.
The requirement of the topic is to give a string 、 Then we need to output them separately 8 For the length of the code and the length of the Huffman code and the ratio of the two .
Input
The input file will contain a list of text strings , Each row of a . The text string will contain only uppercase alphanumeric characters and underscores ( Used in place of spaces ). The end of the input will consist of only words “END” The line as a text string signals . This line should not be handled .
Output
For each text string in the input , Output 8 position ASCII Encoded bit length 、 The bit length of the optimal prefixed variable length coding and the compression rate accurate to the decimal point .
The sample input
AAAAABCD
THE_CAT_IN_THE_HAT
END
Sample output
64 13 4.9
144 51 2.8
The code is as follows :
#include <iostream>
#include <cstring>
#include <queue>
#include <string>
using namespace std;
int a[30];
int main()
{
string s;
while(1)
{
cin >> s;
if(s=="END") break;
memset(a,0,sizeof(a));// You need to format the data before each input
int n=s.size();
for(int i=0;i<n;i++)
{
if(s[i]=='_') a[26]++;// Judge whether it is a space
else a[s[i]-'A']++;
}
priority_queue<int,vector<int>,greater<int> >q;// Ascending priority queue , Ensure that the first element of the team is the minimum
for(int i=0;i<=26;i++)
{
if(a[i]) q.push(a[i]);// input data
}
int ans=n;
while(q.size()>2)// Simulate the operation of building Huffman tree , Specific Baidu
{
int t,t1,t2;
t1=q.top();q.pop();// Its construction method is to select two minimum merges at a time
t2=q.top();q.pop();// Until it merges into a tree
t=t1+t2;
ans+=t;
q.push(t);
}
printf("%d %d %.1lf\n",n*8,ans,(double)n*8/ans);
}
return 0;
}big data 201 ly
边栏推荐
- MDK compilation process and arm compilation tool chain
- [secsha concept] large and small end
- 谷歌浏览器调试工具使用基础版(一)
- 换ip软件的用途很广及原理 动态IP更换的四种方法来保护网络隐私
- How does the proxy IP server ensure its information security in the network
- 1.30 升级bin文件添加后缀及文件长度
- Ideal Path(UVA - 1599)
- Pycharm automatically adds header comments when creating py files
- [software development specification iv] application system security coding specification
- [Code] refers to the repeated number in the offer 03 array
猜你喜欢

Format JS code and debug JS code

FreeBSD bnxt以太网驱动源码阅读记录二:

1.30 升级bin文件添加后缀及文件长度

Handler消息机制-FWK层
![[ickim 2022] the Fourth International Conference on knowledge and information management](/img/e0/1d7aebc6a6fe42c4f4ddd47a2045ec.jpg)
[ickim 2022] the Fourth International Conference on knowledge and information management

Arthas watch 命令查看数组中对象的属性

“元气可乐”不是终点,“中国可乐”才是

腾讯员工晒出薪资:真实985毕业薪资,大家看我还有救吗?网友:日薪?

NiO simple example

U++ learning notes ustruct, uenum declaration and function library simple function implementation
随机推荐
MulDA: A Multilingual Data Augmentation Framework for Low-Resource Cross-Lingual NER 阅读笔记
EasyRecovery15下载量高的恢复率高的数据恢复软件
What is informatization? What is digitalization? What are the connections and differences between the two?
网络性能评估工具 ping/mtr
RHCE之at和crontab命令详解及chrony部署
poj1521
Google Gson用法详解
Mulda: a multilingual data augmentation framework for low resource cross linguistic ner reading notes
电视机软件烧录
Codeforces Round #810 (Div. 2)A~C
机器学习:贝叶斯网络
《自然语言处理实战入门》深度学习基础 ---- attention 注意力机制 ,Transformer 深度解析与学习材料汇总
Record a failure caused by a custom redis distributed lock
点屏注意事项
How to switch IP and move bricks with mobile game simulator
android sqlite先分组后排序左连查询
【ICKIM 2022】第四届知识与信息管理国际会议
[ickim 2022] the Fourth International Conference on knowledge and information management
代理IP服务器如何保证自身在网络中的信息安全呢
第二届中国Rust开发者大会来啦,完整议程大曝光!