What do extra_free and extrastd_free params do in the dictionary object?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

What do extra_free and extrastd_free params do in the dictionary object?

Yaniv Kaul
I could not find a relevant use for them. Can anyone enlighten me?
TIA,
Y.

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Xavi Hernandez
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

I think this is not needed anymore. Probably we could remove these fields if that's the only reason.
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Amar Tumballi-2


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <[hidden email]> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.


I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Xavi Hernandez
On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <[hidden email]> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.

I don't see it. When dict_unserialize() is called, key and value are allocated and copied, so  why do we need to keep the raw data after that ?

In 3.1 the value was simply a pointer to the unserialized data, but starting with 3.2, value is memdup'ed. Key is always copied. I don't see any other reference to the unserialized data right now. I think that instead of assigning the raw data to extra_(std)free, we should simply release that memory and remove those fields.

Am I missing something else ?



I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Amar Tumballi-2


On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <[hidden email]> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.

I don't see it. When dict_unserialize() is called, key and value are allocated and copied, so  why do we need to keep the raw data after that ?

In 3.1 the value was simply a pointer to the unserialized data, but starting with 3.2, value is memdup'ed. Key is always copied. I don't see any other reference to the unserialized data right now. I think that instead of assigning the raw data to extra_(std)free, we should simply release that memory and remove those fields.

Am I missing something else ?

I did grep on 'extra_stdfree' and 'extra_free' and saw that many handshake/ and protocol code seemed to use it. Haven't gone deeper to check which part.

[amar@kadalu glusterfs]$ git grep extra_stdfree | wc -l
40
[amar@kadalu glusterfs]$ git grep extra_free | wc -l
5
 



I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Xavi Hernandez
On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <[hidden email]> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.

I don't see it. When dict_unserialize() is called, key and value are allocated and copied, so  why do we need to keep the raw data after that ?

In 3.1 the value was simply a pointer to the unserialized data, but starting with 3.2, value is memdup'ed. Key is always copied. I don't see any other reference to the unserialized data right now. I think that instead of assigning the raw data to extra_(std)free, we should simply release that memory and remove those fields.

Am I missing something else ?

I did grep on 'extra_stdfree' and 'extra_free' and saw that many handshake/ and protocol code seemed to use it. Haven't gone deeper to check which part.

[amar@kadalu glusterfs]$ git grep extra_stdfree | wc -l
40
[amar@kadalu glusterfs]$ git grep extra_free | wc -l
5

Yes, they call dict_unserialize() and then store the unserialized data into those variables. That's what I'm saying it's not necessary.

 



I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Yaniv Kaul


On Thu, Jan 9, 2020 at 11:35 AM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <[hidden email]> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.

I don't see it. When dict_unserialize() is called, key and value are allocated and copied, so  why do we need to keep the raw data after that ?

In 3.1 the value was simply a pointer to the unserialized data, but starting with 3.2, value is memdup'ed. Key is always copied. I don't see any other reference to the unserialized data right now. I think that instead of assigning the raw data to extra_(std)free, we should simply release that memory and remove those fields.

Am I missing something else ?

I did grep on 'extra_stdfree' and 'extra_free' and saw that many handshake/ and protocol code seemed to use it. Haven't gone deeper to check which part.

[amar@kadalu glusterfs]$ git grep extra_stdfree | wc -l
40
[amar@kadalu glusterfs]$ git grep extra_free | wc -l
5

Yes, they call dict_unserialize() and then store the unserialized data into those variables. That's what I'm saying it's not necessary.

In at least 2 cases, there's even something stranger I could not understand (see in bold - from server_setvolume() function) :
    params = dict_new();
    reply = dict_new();
    ret = xdr_to_generic(req->msg[0], &args, (xdrproc_t)xdr_gf_setvolume_req);
    if (ret < 0) {
        // failed to decode msg;
        req->rpc_err = GARBAGE_ARGS;
        goto fail;
    }
    ctx = THIS->ctx;

    this = req->svc->xl;
    /* this is to ensure config_params is populated with the first brick
     * details at first place if brick multiplexing is enabled
     */
    config_params = dict_copy_with_ref(this->options, NULL);

    buf = gf_memdup(args.dict.dict_val, args.dict.dict_len);
    if (buf == NULL) {
        op_ret = -1;
        op_errno = ENOMEM;
        goto fail;
    }

    ret = dict_unserialize(buf, args.dict.dict_len, &params);
    if (ret < 0) {
        ret = dict_set_str(reply, "ERROR",
                           "Internal error: failed to unserialize "
                           "request dictionary");
        if (ret < 0)
            gf_msg_debug(this->name, 0,
                         "failed to set error "
                         "msg \"%s\"",
                         "Internal error: failed "
                         "to unserialize request dictionary");

        op_ret = -1;
        op_errno = EINVAL;
        goto fail;
    }

    params->extra_free = buf;
    buf = NULL;

 



I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Xavi Hernandez
On Thu, Jan 9, 2020 at 11:11 AM Yaniv Kaul <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 11:35 AM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <[hidden email]> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.

I don't see it. When dict_unserialize() is called, key and value are allocated and copied, so  why do we need to keep the raw data after that ?

In 3.1 the value was simply a pointer to the unserialized data, but starting with 3.2, value is memdup'ed. Key is always copied. I don't see any other reference to the unserialized data right now. I think that instead of assigning the raw data to extra_(std)free, we should simply release that memory and remove those fields.

Am I missing something else ?

I did grep on 'extra_stdfree' and 'extra_free' and saw that many handshake/ and protocol code seemed to use it. Haven't gone deeper to check which part.

[amar@kadalu glusterfs]$ git grep extra_stdfree | wc -l
40
[amar@kadalu glusterfs]$ git grep extra_free | wc -l
5

Yes, they call dict_unserialize() and then store the unserialized data into those variables. That's what I'm saying it's not necessary.

In at least 2 cases, there's even something stranger I could not understand (see in bold - from server_setvolume() function) :
    params = dict_new();
    reply = dict_new();
    ret = xdr_to_generic(req->msg[0], &args, (xdrproc_t)xdr_gf_setvolume_req);
    if (ret < 0) {
        // failed to decode msg;
        req->rpc_err = GARBAGE_ARGS;
        goto fail;
    }
    ctx = THIS->ctx;

    this = req->svc->xl;
    /* this is to ensure config_params is populated with the first brick
     * details at first place if brick multiplexing is enabled
     */
    config_params = dict_copy_with_ref(this->options, NULL);

    buf = gf_memdup(args.dict.dict_val, args.dict.dict_len);

This is probably unnecessary if we can really remove extra_free. Probably it's here because args.dict.dict_val will be destroyed later.


    if (buf == NULL) {
        op_ret = -1;
        op_errno = ENOMEM;
        goto fail;
    }

    ret = dict_unserialize(buf, args.dict.dict_len, &params);
    if (ret < 0) {
        ret = dict_set_str(reply, "ERROR",
                           "Internal error: failed to unserialize "
                           "request dictionary");
        if (ret < 0)
            gf_msg_debug(this->name, 0,
                         "failed to set error "
                         "msg \"%s\"",
                         "Internal error: failed "
                         "to unserialize request dictionary");

        op_ret = -1;
        op_errno = EINVAL;
        goto fail;
    }

    params->extra_free = buf;
    buf = NULL;

This is needed to prevent memory pointed by buf from being destroyed once if has been assigned to extra_free. Also unnecessary if extra_free can be removed.

So we could eliminate one memory copy here if those fields are not really necessary.


 



I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Yaniv Kaul
To conclude and ensure I understood both:
- I've sent a patch to remove extra_free parameter (and the associated gf_memdup() in 2 places) - https://review.gluster.org/#/c/glusterfs/+/23999/ (waiting for CI to finish working on it).
- I'll send a patch to remove extra_stdfree - but this one is slightly bigger:
1. There are more places it's set (18).
2. Somehow, we need to release that memory. I kinda assume the sooner the better?
So perhaps I can just replace:
dict->extra_stdfree = cli_req.dict.dict_val;
with:
free(cli_req.dict.dict_val) ?
3. In some places, there was actually a call to GF_FREE, which is kind of confusing. For example, in __server_get_snap_info() :
    if (snap_info_rsp.dict.dict_val) {
        GF_FREE(snap_info_rsp.dict.dict_val);
    }

I think I should remove that and stick to freeing right after unserialization?
Y.



On Thu, Jan 9, 2020 at 12:42 PM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 11:11 AM Yaniv Kaul <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 11:35 AM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <[hidden email]> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.

I don't see it. When dict_unserialize() is called, key and value are allocated and copied, so  why do we need to keep the raw data after that ?

In 3.1 the value was simply a pointer to the unserialized data, but starting with 3.2, value is memdup'ed. Key is always copied. I don't see any other reference to the unserialized data right now. I think that instead of assigning the raw data to extra_(std)free, we should simply release that memory and remove those fields.

Am I missing something else ?

I did grep on 'extra_stdfree' and 'extra_free' and saw that many handshake/ and protocol code seemed to use it. Haven't gone deeper to check which part.

[amar@kadalu glusterfs]$ git grep extra_stdfree | wc -l
40
[amar@kadalu glusterfs]$ git grep extra_free | wc -l
5

Yes, they call dict_unserialize() and then store the unserialized data into those variables. That's what I'm saying it's not necessary.

In at least 2 cases, there's even something stranger I could not understand (see in bold - from server_setvolume() function) :
    params = dict_new();
    reply = dict_new();
    ret = xdr_to_generic(req->msg[0], &args, (xdrproc_t)xdr_gf_setvolume_req);
    if (ret < 0) {
        // failed to decode msg;
        req->rpc_err = GARBAGE_ARGS;
        goto fail;
    }
    ctx = THIS->ctx;

    this = req->svc->xl;
    /* this is to ensure config_params is populated with the first brick
     * details at first place if brick multiplexing is enabled
     */
    config_params = dict_copy_with_ref(this->options, NULL);

    buf = gf_memdup(args.dict.dict_val, args.dict.dict_len);

This is probably unnecessary if we can really remove extra_free. Probably it's here because args.dict.dict_val will be destroyed later.


    if (buf == NULL) {
        op_ret = -1;
        op_errno = ENOMEM;
        goto fail;
    }

    ret = dict_unserialize(buf, args.dict.dict_len, &params);
    if (ret < 0) {
        ret = dict_set_str(reply, "ERROR",
                           "Internal error: failed to unserialize "
                           "request dictionary");
        if (ret < 0)
            gf_msg_debug(this->name, 0,
                         "failed to set error "
                         "msg \"%s\"",
                         "Internal error: failed "
                         "to unserialize request dictionary");

        op_ret = -1;
        op_errno = EINVAL;
        goto fail;
    }

    params->extra_free = buf;
    buf = NULL;

This is needed to prevent memory pointed by buf from being destroyed once if has been assigned to extra_free. Also unnecessary if extra_free can be removed.

So we could eliminate one memory copy here if those fields are not really necessary.


 



I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

Reply | Threaded
Open this post in threaded view
|

Re: What do extra_free and extrastd_free params do in the dictionary object?

Xavi Hernandez
On Mon, Jan 13, 2020 at 9:41 AM Yaniv Kaul <[hidden email]> wrote:
To conclude and ensure I understood both:
- I've sent a patch to remove extra_free parameter (and the associated gf_memdup() in 2 places) - https://review.gluster.org/#/c/glusterfs/+/23999/ (waiting for CI to finish working on it).
- I'll send a patch to remove extra_stdfree - but this one is slightly bigger:
1. There are more places it's set (18).
2. Somehow, we need to release that memory. I kinda assume the sooner the better?
So perhaps I can just replace:
dict->extra_stdfree = cli_req.dict.dict_val;
with:
free(cli_req.dict.dict_val) ?

Yes. I think this is the best and easiest approach.
 
3. In some places, there was actually a call to GF_FREE, which is kind of confusing. For example, in __server_get_snap_info() :
    if (snap_info_rsp.dict.dict_val) {
        GF_FREE(snap_info_rsp.dict.dict_val);
    }

This seems like a bug. Additionally, this memory should be released using free() instead of GF_FREE().
 

I think I should remove that and stick to freeing right after unserialization?

Yes. I agree.

Regards,

Xavi



On Thu, Jan 9, 2020 at 12:42 PM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 11:11 AM Yaniv Kaul <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 11:35 AM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 10:22 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 2:33 PM Xavi Hernandez <[hidden email]> wrote:
On Thu, Jan 9, 2020 at 9:44 AM Amar Tumballi <[hidden email]> wrote:


On Thu, Jan 9, 2020 at 1:38 PM Xavi Hernandez <[hidden email]> wrote:
On Sun, Dec 22, 2019 at 4:56 PM Yaniv Kaul <[hidden email]> wrote:
I could not find a relevant use for them. Can anyone enlighten me?

I'm not sure why they are needed. They seem to be used to keep the unserialized version of a dict around until the dict is destroyed. I thought this could be because we were using pointers to the unserialized data inside dict, but that's not the case currently. However, checking very old versions (pre 3.2), I see that dict values were not allocated, but a pointer to the unserialized data was used.

Xavi,

While you are right about the intent, it is used still, at least when I grepped latest repo to keep a reference in protocol layer. 

This is done to reduce a copy after the dictionary's binary content is received from RPC. The 'extra_free' flag is used when we use a GF_*ALLOC()'d buffer in protocol to receive dictionary, and extra_stdfree is used when RPC itself allocates the buffer and hence uses 'free()' to free the buffer.

I don't see it. When dict_unserialize() is called, key and value are allocated and copied, so  why do we need to keep the raw data after that ?

In 3.1 the value was simply a pointer to the unserialized data, but starting with 3.2, value is memdup'ed. Key is always copied. I don't see any other reference to the unserialized data right now. I think that instead of assigning the raw data to extra_(std)free, we should simply release that memory and remove those fields.

Am I missing something else ?

I did grep on 'extra_stdfree' and 'extra_free' and saw that many handshake/ and protocol code seemed to use it. Haven't gone deeper to check which part.

[amar@kadalu glusterfs]$ git grep extra_stdfree | wc -l
40
[amar@kadalu glusterfs]$ git grep extra_free | wc -l
5

Yes, they call dict_unserialize() and then store the unserialized data into those variables. That's what I'm saying it's not necessary.

In at least 2 cases, there's even something stranger I could not understand (see in bold - from server_setvolume() function) :
    params = dict_new();
    reply = dict_new();
    ret = xdr_to_generic(req->msg[0], &args, (xdrproc_t)xdr_gf_setvolume_req);
    if (ret < 0) {
        // failed to decode msg;
        req->rpc_err = GARBAGE_ARGS;
        goto fail;
    }
    ctx = THIS->ctx;

    this = req->svc->xl;
    /* this is to ensure config_params is populated with the first brick
     * details at first place if brick multiplexing is enabled
     */
    config_params = dict_copy_with_ref(this->options, NULL);

    buf = gf_memdup(args.dict.dict_val, args.dict.dict_len);

This is probably unnecessary if we can really remove extra_free. Probably it's here because args.dict.dict_val will be destroyed later.


    if (buf == NULL) {
        op_ret = -1;
        op_errno = ENOMEM;
        goto fail;
    }

    ret = dict_unserialize(buf, args.dict.dict_len, &params);
    if (ret < 0) {
        ret = dict_set_str(reply, "ERROR",
                           "Internal error: failed to unserialize "
                           "request dictionary");
        if (ret < 0)
            gf_msg_debug(this->name, 0,
                         "failed to set error "
                         "msg \"%s\"",
                         "Internal error: failed "
                         "to unserialize request dictionary");

        op_ret = -1;
        op_errno = EINVAL;
        goto fail;
    }

    params->extra_free = buf;
    buf = NULL;

This is needed to prevent memory pointed by buf from being destroyed once if has been assigned to extra_free. Also unnecessary if extra_free can be removed.

So we could eliminate one memory copy here if those fields are not really necessary.


 



I think this is not needed anymore. Probably we could remove these fields if that's the only reason.

If keeping them is hard to maintain, we can add few allocation to remove those elements, that shouldn't matter much IMO too. We are not using dictionary itself as protocol now (which we did in 1.x series though).

Regards,
Amar
---

 
TIA,
Y.
_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel

_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel


_______________________________________________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968


NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-devel mailing list
[hidden email]
https://lists.gluster.org/mailman/listinfo/gluster-devel