Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
327 views
in Technique[技术] by (71.8m points)

c - Create customized header (metadata) for files

Here I want to create a header which contains other file details like metadata of other files.

This code is works fine if I use static values for struct file_header. If I am using malloc for struct file_header then I am getting a problem in this code. Specifically, I'm getting a problem in fread. Maybe fwrite worked perfectly. The code is here:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <string.h>

char path[1024] = "/home/test/main/Integration/testing/package_DIR";

//int count = 5;

struct files {

    char *file_name;
    int file_size;
};

typedef struct file_header {

    int file_count;
    struct files file[5];
} metadata;


metadata *create_header();

int main() {
    FILE *file = fopen("/home/test/main/Integration/testing/file.txt", "w");
    metadata *header;
    header = create_header();
    if(header != NULL)
    {
        printf("size of Header is %d
",sizeof(header));
    }

    if (file != NULL) {

        if (fwrite(&header, sizeof(header), 1, file) < 1) {
            puts("short count on fwrite");
        }
        fclose(file);
    }
    file = fopen("/home/test/main/Integration/testing/file.txt", "rb");
    if (file != NULL) {
        metadata header = { 0 };
        if (fread(&header, sizeof(header), 1, file) < 1) {
            puts("short count on fread");
        }
        fclose(file);
        printf("File Name = %s
", header.file[0].file_name);
        printf("File count = %d
", header.file_count);
        printf("File Size = %d
", header.file[0].file_size);
    }
    return 0;
}

metadata *create_header()
{
    int file_count = 0;
    DIR * dirp;
    struct dirent * entry;
    dirp = opendir(path);
    metadata *header = (metadata *)malloc(sizeof(metadata));
    while ((entry = readdir(dirp)) != NULL) {
        if (entry->d_type == DT_REG) { /* If the entry is a regular file */

            header->file[file_count].file_name = (char *)malloc(sizeof(char)*strlen(entry->d_name));
            strcpy(header->file[file_count].file_name,entry->d_name);
            //Put static but i have logic for this i will apply later.
            header->file[file_count].file_size = 10;
            file_count++;

        }
    }
    header->file_count = file_count;
    closedir(dirp);
    //printf("File Count : %d
", file_count);
    return header;
}

Output:

size of Header is 8
short count on fread
File Name = (null)
File count = 21918336
File Size = 0

Can anybody please help me to solve this issue?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You're working on a 64-bit machine because your pointers are 8 bytes long.

You're trying to write data out to a file and then read it back in. You're running into problems because pointers don't write very well. (More precisely: pointers can be written without any problems, but the pointers only have meaning in the current running program, and are seldom suitable for writing to disk and even more seldom suitable for reading back from disk.)

This part of your code illustrates the problem:

struct files {
    char *file_name;
    int file_size;
};

typedef struct file_header {
    int file_count;
    struct files file[5];
} metadata;


metadata *create_header();

int main() {
    FILE *file = fopen("/home/test/main/Integration/testing/file.txt", "w");
    metadata *header;
    header = create_header();
    if(header != NULL)
    {
        printf("size of Header is %d
",sizeof(header));
    }

Side comments:

  • Make the file name into an argument to main(), or, at least, into a variable. Writing the name out twice makes it hard to change.
  • It is good that you are doing some error detection. However, I'm not going to critique it, though there is considerable room for improvement in it.

Main comments:

  • You see size of Header is 8 in the output because header is a pointer. The sizeof(metadata) (the type that header points to) is much larger, probably 48 bytes, but it depends on how your compiler aligns and packs data in the structure.

    if (file != NULL) {    
        if (fwrite(&header, sizeof(header), 1, file) < 1) {
            puts("short count on fwrite");
        }
        fclose(file);
    }
    

This code writes 8 bytes of data to the file. What it writes is the address where your header variable is stored. It does not write anything of the data that it points at.

What would get closer to what you are after (but is still not going to work) is:

        if (fwrite(header, sizeof(*header), 1, file) < 1) {
            puts("short count on fwrite");
        }

This will write 48 bytes or thereabouts out to file. However, your file won't contain the file names; it will merely contain the pointers to where the file names were stored at the time when the file was written. Be very careful here. If you read this file, you might even see it appearing to work because the names may not have been erased from memory yet.

To get the file names into the file, you will have to handle each one separately. You'll have to decide on a convention. For example, you might decide that the name will be prefixed by a 2-byte unsigned short which contains the length of the file name, L, followed by L+1 bytes of data containing the file name and its terminal NUL ''. Then you'd write the other (fixed size) parts of the per-file data. And you'd repeat this process for each of the files. The converse operation, reading the file, requires understanding of the written structure. At the point where you expect a file name, you'll read the two-byte length, and you can use that length to allocate the space for the file name. Then you read the L+1 bytes into the newly allocated file name. Then you read the other fixed-length data for the file, then move onto the next file.

If you want to be able to do it all in a single fwrite() and then fread(), you are going to have to revise your data structure:

struct files {
    char  file_name[MAX_PERMITTED_FILENAME_LENGTH];
    int   file_size;
};

You get to decide what the maximum permitted filename length is, but it is fixed. If your names are short, you don't use all the space; if your names are long, they may be truncated. Your metadata structure size now increases dramatically (at least if MAX_PERMITTED_FILENAME_LENGTH is a reasonable size, say between 32 and 1024 bytes). But you can read and write the entire metadata structure in a single operation with this.


Thanks for your reply but I am new in C so how can I achieve this thing?

Eventually, you'll be able to code it somewhat like this.

#include <dirent.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { MAX_FILES = 5 };

struct files
{
    char *file_name;
    int file_size;
};

typedef struct file_header
{
    int file_count;
    struct files file[MAX_FILES];
} metadata;

static void err_exit(const char *format, ...);
static metadata *create_header(const char *directory);
static void release_header(metadata *header);
static void write_header(FILE *fp, const metadata *header);
static metadata *read_header(FILE *fp);
static void dump_header(FILE *fp, const char *tag, const metadata *header);

int main(int argc, char **argv)
{
    if (argc != 3)
        err_exit("Usage: %s file directory
", argv[0]);

    const char *name = argv[1];
    const char *path = argv[2];
    FILE *fp = fopen(name, "wb");

    if (fp == 0)
        err_exit("Failed to open file %s for writing (%d: %s)
", name, errno, strerror(errno));

    metadata *header = create_header(path);
    dump_header(stdout, "Data to be written", header);
    write_header(fp, header);
    fclose(fp);                     // Ignore error on close
    release_header(header);

    if ((fp = fopen(name, "rb")) == 0)
        err_exit("Failed to open file %s for reading (%d: %s)
", name, errno, strerror(errno));

    metadata *read_info = read_header(fp);
    dump_header(stdout, "Data as read", read_info);
    release_header(read_info);

    fclose(fp);                     // Ignore error on close
    return 0;
}

static metadata *create_header(const char *path)
{
    int file_count = 0;
    DIR * dirp = opendir(path);
    struct dirent * entry;
    if (dirp == 0)
        err_exit("Failed to open directory %s (%d: %s)
", path, errno, strerror(errno));
    metadata *header = (metadata *)malloc(sizeof(metadata));
    if (header == 0)
        err_exit("Failed to malloc space for header (%d: %s)
", errno, strerror(errno));

    header->file_count = 0;
    while ((entry = readdir(dirp)) != NULL && file_count < MAX_FILES)
    {
        // d_type is not portable - POSIX says you can only rely on d_name and d_ino
        if (entry->d_type == DT_REG)
        {   /* If the entry is a regular file */
            // Avoid off-by-one under-allocation by using strdup()
            header->file[file_count].file_name = strdup(entry->d_name);
            if (header->file[file_count].file_name == 0)
                err_exit("Failed to strdup() file %s (%d: %s)
", entry->d_name, errno, strerror(errno));
            //Put static but i have logic for this i will apply later.
            header->file[file_count].file_size = 10;
            file_count++;
        }
    }
    header->file_count = file_count;
    closedir(dirp);
    //printf("File Count : %d
", file_count);
    return header;
}

static void write_header(FILE *fp, const metadata *header)
{
    if (fwrite(&header->file_count, sizeof(header->file_count), 1, fp) != 1)
        err_exit("Write error on file count (%d: %s)
", errno, strerror(errno));
    const struct files *files = header->file;
    for (int i = 0; i < header->file_count; i++)
    {
        unsigned short name_len = strlen(files[i].file_name) + 1;
        if (fwrite(&name_len, sizeof(name_len), 1, fp) != 1)
            err_exit("Write error on file name length (%d: %s)
", errno, strerror(errno));
        if (fwrite(files[i].file_name, name_len, 1, fp) != 1)
            err_exit("Write error on file name (%d: %s)
", errno, strerror(errno));
        if (fwrite(&files[i].file_size, sizeof(files[i].file_size), 1, fp) != 1)
            err_exit("Write error on file size (%d: %s)
", errno, strerror(errno));
    }
}

static metadata *read_header(FILE *fp)
{
    metadata *header = malloc(sizeof(*header));
    if (header == 0)
        err_exit("Failed to malloc space for header (%d:%s)
", errno, strerror(errno));
    if (fread(&header->file_count, sizeof(header->file_count), 1, fp) != 1)
        err_exit("Read error on file count (%d: %s)
", errno, strerror(errno));
    struct files *files = header->file;
    for (int i = 0; i < header->file_count; i++)
    {
        unsigned short name_len;
        if (fread(&name_len, sizeof(name_len), 1, fp) != 1)
            err_exit("Read error on file name length (%d: %s)
", errno, strerror(errno));
        files[i].file_name = malloc(name_len);
        if (files[i].file_name == 0)
            err_exit("Failed to malloc space for file name (%d:%s)
", errno, strerror(errno));
        if (fread(files[i].file_name, name_len, 1, fp) != 1)
            err_exit("Read error on file name (%d: %s)
", errno, strerror(errno));
        if (fread(&files[i].file_size, sizeof(files[i].file_size), 1, fp) != 1)
            err_exit("Read error on file size (%d: %s)
", errno, strerror(errno));
    }
    return(header);
}

static void dump_header(FILE *fp, const char *tag, const metadata *header)
{
    const struct files *files = header->file;
    fprintf(fp, "Metadata: %s
", tag);
    fprintf(fp, "File count: %d
", header->file_count);
    for (int i = 0; i < header->file_count; i++)
        fprintf(fp, "File %d: size %5d, name %s
", i, files[i].file_size, files[i].file_name);
}

static void release_header(metadata *header)
{
    for (int i = 0; i < header->file_count; i++)
    {
        /* Zap file name, and pointer to file name */
        memset(header->file[i].file_name, 0xDD, strlen(header->file[i].file_name)+1);
        free(header->file[i].file_name);
        memset(&header->file[i].file_name, 0xEE, sizeof(header->file[i].file_name));
    }
    free(header);
}

static void err_exit(const char *format, ...)
{
    va_list args;
    va_start(args, format);
    vfprintf(stderr, format, args);
    va_end(args);
    exit(EXIT_FAILURE);
}

I compiled it as dump_file, and ran it as shown:

$ dump_file xyz .
Metadata: Data to be written
File count: 5
File 0: size    10, name .gitignore
File 1: size    10, name args.c
File 2: size    10, name atob.c
File 3: size    10, name bp.pl
File 4: size    10, name btwoc.c
Metadata: Data as read
File count: 5
File 0: size    10, name .gitignore
File 1: size    10, name args.c
File 2: size    10, name atob.c
File 3: size    10, name bp.pl
File 4: size    10, name btwoc.c
$ odx xyz
0x0000: 05 00 00 00 0B 00 2E 67 69 74 69 67 6E 6F 72 65   .......gitignore
0x0010: 00 0A 00 00 00 07 00 61 72 67 73 2E 63 00 0A 00   .......args.c...
0x0020: 00 00 07 00 61 74 6F 62 2E 63 00 0A 00 00 00 06   ....atob.c......
0x0030: 00 62 70 2E 70 6C 00 0A 00 00 00 08 00 62 74 77   .bp.pl.......btw
0x0040: 6F 63 2E 63 00 0A 00 00 00                        oc.c.....
0x0049:
$

I should probably have renamed err_exit() as err_sysexit() and revised the error handling so that errno and the corresponding string were handled inside that function, rather than repeatedly adding errno and strerror(errno) to the calls to err_exit().


Information from comments

Transferring some of the rather extensive commentary into the question:


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...